INTRODUCTION

Field of Invention

This invention relates to recombinant proteins, genes,
and gene probes and more specifically to such proteins and
probes derived from an enterically transmitted nonA/nonB
hepatitis viral agent, to diagnostic methods and vaccine
applications which employ the proteins and probes, and to gene
segments that encode specific epitopes (and proteins
artificially produced to contain those epitopes) that are
particularly useful in diagnosis and prophylaxis.

Background

Enterically transmitted non-A/non-B hepatitis viral
agent (ET-NANB; also referred to herein as HEV) is the reported
cause of hepatitis in several epidemics and sporadic cases in
Asia, Africa, Europe, Mexico, and the Indian subcontinent.
Infection is usually by water contaminated with feces, although
the virus may also spread by close physical contact.
The virus does not seem to cause chronic infection.

The viral etiology in ET-NANB has been demonstrated by
infection of volunteers with pooled fecal isolates;
immune electron microscopy (IEM) studies have shown
virus particles with 27-34 nm diameters in stools
from infected individuals. The virus particles reacted
with antibodies in serum from infected individuals
from geographically distinct regions, suggesting that
a single viral agent or class is responsible for the
majority of ET-NANB hepatitis seen worldwide. No
antibody reaction was seen in serum from individuals
infected with parenterally transmitted NANB virus
(also known as hepatitis C virus or HCV), indicating
a different specificity between the two NANB types.

In addition to serological differences, the
two types of NANB infection show distinct clinical
differences. ET-NANB is characteristically an acute
infection, often associated with fever and arthralgia,
and with portal inflammation and associated bile
stasis in liver biopsy specimens (Arankalle).
Symptoms are usually resolved within six weeks.
Parenterally transmitted NANB, by contrast, produces a
chronic infection in about 50% of the cases. Fever and
arthralgia are rarely seen, and inflammation has a
predominantly parenchymal distribution (Khuroo, 1980).
The course of ET-NANBH is generally uneventful in
healthy individuals, and the vast majority of those
infected recover without the chronic sequelae seen
with HCV. One peculiar epidemiologic feature of this
disease, however, is the markedly high mortality
observed in pregnant women; this is reported in
numerous studies to be on the order of 10-20%. This
finding has been seen in a number of epidemiologic
studies but at present remains unexplained. Whether
this reflects viral pathogenicity, the lethal
consequence of the interaction of virus and immune
suppressed (pregnant) host, or a reflection of the
debilitated prenatal health of a susceptible
malnourished population remains to be clarified.

The two viral agents can also be distinguished
on the basis of primate host susceptibility.
ET-NANB, but not the parenterally transmitted agent,
can be transmitted to cynomolgus monkeys. The
parenterally transmitted agent is more readily
transmitted to chimpanzees than is ET-NANB (Bradley,
1987).

There have been major efforts worldwide to
identify and clone viral genomic sequences associated
with ET-NANB hepatitis. One goal of this effort,
requiring virus-specific genomic sequences, is to
identify and characterize the nature of the virus and
its protein products. Another goal is to produce
recombinant viral proteins which can be used in
antibody-based diagnostic procedures and for a
vaccine. Despite these efforts, viral sequences
associated with ET-NANB hepatitis have not been
successfully identified or cloned heretofore, nor have
any virus-specific proteins been identified or
produced.

Relevant Literature

Arankalle, V.A., et al., The Lancet, 550 (March 12, 1988).

Bradley, D.W., et al., J. Gen. Virol., 69:1 (1988).

Bradley, D.W., et al., Proc. Nat. Acad. Sci., USA, 84:6277 (1987).

Gravelle, C.R., et al., J. Infect. Diseases, 131:167 (1975).

Kane, M.A., et al., JAMA, 252:3140 (1984).

Khuroo, M.S., Am. J. Med., 48:818 (1980).

Khuroo, M.S., et al., Am. J. Med., 68:818 (1983).

Maniatis, T., et al., Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory (1982).

Seto, B., et al., Lancet, 11:941 (1984).

Sreenivasan, M.A., et al., J. Gen. Virol., 65:1005 (1984).

Tabor, E., et al., J. Infect. Dis., 140:789 (1979).

SUMMARY OF THE INVENTION

Novel compositions, as well as methods of
preparation and use of the compositions, are provided,
where the compositions comprise viral proteins and
fragments thereof derived from the viral agent for
ET-NANB. A number of specific fragments of viral proteins
(and the corresponding genetic sequences) that are
particularly useful in diagnosis and vaccine
production are also disclosed. Methods for preparation
of ET-NANB viral proteins include isolating ET-NANB
genomic sequences which are then cloned and expressed
in a host cell. The resultant recombinant viral
proteins find use as diagnostic agents and as
vaccines. The genomic sequences and fragments thereof
find use in preparing ET-NANB viral proteins and as
probes for virus detection.



BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows vector constructions and
manipulations used in obtaining and sequencing cloned
ET-NANB fragment; and

Figures 2A-2B are representations of
Southern blots in which a radiolabeled ET-NANB probe
was hybridized with amplified cDNA fragments prepared
from RNA isolated from infected (I) and non-infected
(N) bile sources (2A), and from infected (I) and
non-infected (N) stool-sample sources (2B).
DESCRIPTION OF SPECIFIC EMBODIMENTS

Novel compositions comprising genomic
sequences and fragments thereof derived from the viral
agent for ET-NANB are provided, together with
recombinant viral proteins produced using the genomic
sequences and methods of using these compositions.
Epitopes on the viral protein have been identified
that are particularly useful in diagnosis and vaccine
production. Small peptides containing the epitopes are
recognized by multiple sera of patients infected with
ET-NANB.

The molecular cloning of HEV was accomplished
by two very different approaches. The first
successful identification of a molecular clone was
based on the differential hybridization of putative
HEV cDNA clones to heterogeneous cDNA from infected
and uninfected cyno bile. cDNAs from both sources
were labeled to high specific activity with 32P to
identify a clone that hybridized specifically to the
infected source probe. A cyno monkey infected with
the Burma isolate of HEV was used in these first
experiments. The sensitivity of this procedure is
directly related to the relative abundance of the
specific sequence against the overall background. In
control experiments, it was found that specific
identification of a target sequence may be obtained
with as little as 1 specific part per 1000 background
sequences. A number of clones were identified by this
procedure using libraries and probes made from
infected (Burma isolate) and control uninfected cyno
bile. The first extensively characterized clone of
the 16 plaques purified by this protocol was given the
designation ET1.1.
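
The link between target abundance and screening effort can be illustrated with a simple sampling calculation. The sketch below is not part of the described procedure. It assumes that clones are sampled independently and that the target sequence makes up a fixed fraction of the library; the function name and the 99% confidence level are illustrative choices only.

```python
import math


def clones_to_screen(target_fraction, confidence=0.99):
    """Estimate how many independent clones must be screened so that
    at least one target clone is seen with the given probability.

    Assumption (not from the specification): each clone is an
    independent draw, and a fraction `target_fraction` of all clones
    carries the target sequence.
    """
    if not 0 < target_fraction < 1:
        raise ValueError("target_fraction must be between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # P(no target in n draws) = (1 - f)^n, so solve (1 - f)^n <= 1 - P for n.
    return math.ceil(math.log(1 - confidence) / math.log(1 - target_fraction))


if __name__ == "__main__":
    # One specific part per 1000 background sequences, as in the control experiments.
    print(clones_to_screen(1 / 1000))
```

Under these assumptions, a target present at 1 part per 1000 calls for screening on the order of a few thousand clones (about 4,600 at 99% confidence). A rarer sequence would need proportionally more.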

ET1.1 was first characterized as both
derived from and unique to the infected source cDNA.
Heterogeneous cDNA was amplified from both infected
and uninfected sources using a sequence independent
single primer amplification technique (SISPA). This
technique is described in copending application serial
No. 208,512, filed June 17, 1983. The limited pool of
cDNA made from Burma infected cyno bile could then be
amplified enzymatically prior to cloning or
hybridization using putative HEV clones as probes.
ET1.1 hybridized specifically to the original bile
cDNA from the infected source. Further validation of
this clone as derived from the genome of HEV was
demonstrated by the similarity of the ET1.1 sequence
and those present in SISPA cDNA prepared from five
different human stool samples collected from
different ET-NANBH epidemics including Somalia,
Tashkent, Borneo, Mexico and Pakistan. These
molecular epidemiologic studies established the
isolated sequence as derived from the virus that
represented the major cause of ET-NANBH worldwide.

The viral specificity of ET1.1 was further
established by the finding that the clone hybridized
specifically to RNA extracted from infected cyno
liver. Hybridization analysis of polyadenylated RNA
demonstrated a unique 7.5 Kb polyadenylated
transcript not present in uninfected liver. The size
of this transcript suggested that it represented the
full length viral genome. Strand specific
oligonucleotides were also used to probe viral genomic
RNA extracted directly from semi-purified virions
prepared from human stool. The strand specificity was
based on the RNA-directed RNA polymerase (RDRP) open
reading frame (ORF) identified in ET1.1 (see below).
Only the probe detecting the sense strand hybridized
to the nucleic acid. These studies characterized HEV
as a plus sense, single stranded genome. Strand
specific hybridization to RNA extracted from the liver
also established that the vast majority of
intracellular transcript was positive sense. Barring
any novel mechanism for virus expression, the negative
strand, although not detectable, would be present at a
ratio of less than 1:100 when compared with the sense
strand.

ET1.1 was documented as exogenous when
tested by both Southern blot hybridization and PCR
using genomic DNAs derived from uninfected humans,
infected and uninfected cynos and also the genomic
DNAs from E. coli and various bacteriophage sources.
The latter were tested in order to rule out trivial
contamination with an exogenous sequence introduced
during the numerous enzymatic manipulations performed
during cDNA construction and amplification. It was
also found that the nucleotide sequence of the ET1.1
clone was not homologous to any entries in the
GenBank database. The translated open reading frame
of the ET1.1 clone did, however, demonstrate limited
homology with consensus amino acid residues consistent
with an RNA-directed RNA polymerase. This consensus
amino acid motif is shared among all positive strand
RNA viruses and, as noted above, is present at the 3'
end of the HCV genome. The 1.3 Kb clone was therefore
presumed to be derived, at least in part, from the
nonstructural portion of the viral genome.

Because of the relationship of different
strains of ET-NANB to each other that has been
demonstrated by the present invention, the genome of
the ET-NANB viral agent is defined in this
specification as containing a region which is
homologous to the 1.33 kb DNA EcoRI insert present in
plasmid pTZKF1 (ET1.1) carried in E. coli strain BB4
and having ATCC deposit no. 67717. The entire
sequence, in both directions, has now been identified
as set forth below. The sequences of both strands are
provided, since both strands can encode proteins.
However, the sequence in one direction has been
designated as the "forward" sequence because of
statistical similarities to known proteins and because
the forward sequence is known to be predominantly
protein-encoding. This sequence is set forth below
along with the three possible translation sequences.
There is one long open reading frame that starts at
nucleotide 145 with an isoleucine and extends to the
end of the sequence. The two other reading frames have
many termination codons. Standard abbreviations for
nucleotides and amino acids are used here and
elsewhere in this specification.
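
The three possible translations of a forward sequence can be reproduced with a short script. The following is a minimal sketch that assumes the standard genetic code; the function names are illustrative and do not come from this specification. The example input is the first 20 nucleotides of the forward sequence, with the OCR character "7" read as "T", which is itself an assumption about the damaged listing.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code with codons enumerated in TCAG order; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate seq starting at offset `frame` (0, 1, or 2).

    Unrecognized codons (for example, ones containing N) become 'X'.
    """
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(seq):
    """Return {1: ..., 2: ..., 3: ...}, keyed by the starting nucleotide."""
    return {frame + 1: translate(seq, frame) for frame in range(3)}


if __name__ == "__main__":
    # Hypothetical cleaned-up start of the forward sequence ("7" read as "T").
    for start, protein in three_frame_translation("AGACCTGTCCCTGTTGCAGC").items():
        print(start, protein)
```

For this fragment the three frames begin RPVPVA, DLSLLQ and TCPCCS. These agree with the opening residues listed under SEQ ID NO. 2, 3 and 4 (Arg Pro Val Pro Val Ala; Asp Leu Ser Leu Leu Gln; Thr Cys Pro Cys Cys Ser).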

The gene sequence given below is
substantially identical to one given in the parent
application. The present sequence differs in the
omission of the first 3" nucleotides at the 5' end and
last 13 nucleotides at the 3' end, which are derived
from the linker used for cloning rather than from the
virus. In addition, a G was omitted at position 227
of the sequence given in the parent application.

The following gene sequence has SEQ ID NO. 1;
the first amino acid sequence in reading frame
beginning with nucleotide 1 has SEQ ID NO. 2; the
second amino acid sequence in reading frame beginning
with nucleotide 2 has SEQ ID NO. 3; and the third amino
acid sequence in reading frame beginning with
nucleotide 3 has SEQ ID NO. 4.

Forward Sequence

SEQ ID NO. 1:

AGACCTGTCC CTG77GCAGC TGTTC7-c:a CCC7GCCCCG AGC7CGAACA GGGCC77C7C 60 

7acc7gcccc aggagc7cac cacc7g7ga7 ~g~;7:g7aa ca777gaa77 aacagaca77 120 

30 g7gcac7gcc gca7ggccgc cccgagccag ggcaaggccg 7gc7gtccac ac7cg7gggc 180 

cgc7acggcg g7cgcacaaa gxc~-c-- 3c"cxac t ctgatgttcg cgac7c7c7c 240 

gcccgtttta 7cc:gg::-7 'G3i:::g'- :^gg"^:-~ ;"G7gaa~7 g7acgagc7a bco 

35 

G7GGAGGCCA 7ou7CGAG-A GGGCl- GGC"CGCG3 7CC77GAGC7 7GA7C777GC 360 

AACCG7GACG 'G7CCAGGA" C^c:": I :^GiiAGA7' G7AAC AAGTT CACCACAGG7 420 

40 GAGACCA77G CCCA'GG'-- - 3"GG * ~- j GGZ-'C'CGG CC'GGAGCAA GACC77C7GC 480 

GCCC7C77 T G GCw^< i cu . . ^j.Gl - jAGAm OJu .a 7T73GCCCT GC7CCC7CAG 540 

GG7G7G7 T 7" ACGG'G-'GG :""G-']-: -C:G": tt C CGGCGGC7G7 GGCCGCAGCA 600 

'GCATCCA 7buloi;:^ JAA ^ . ,.,^u:,:u -^wTCACCCA GAATAACTTT 660 

TC^CTGGGTC TAGAG'3'G: 'ATTA^GGAG GAGTG'GGGA TGCCGCAGTG GCTCATCCGC 720 

CTGTATCACC TTA'AAGGTC 'GCQTGGA'C TTGCAGGCCC CGAAGGAGTC TCTGCGAGGG 780 

TT—GGAAGA AACACTCCGG "GAGCCC3GC ACTCTTCTAT GGAATACTGT CTGGAATATG 840 

GCCGTTAT*!\A CC:aCTGTTA ~GAC"CCGC GATTT'CAGG tggctgcctt TAAAGGTGAT 900 

GATTCGATAG TGCTTTGCAG TGAGTATCGT CAGAGTCCAG GAGCTGCTGT CCTGATCGCC 960 

GGCTGToGCT tgaagttgaa GGTAGATTTC CGCCCGATCG GTTTGTATGC AGGTGTTGTG 1020 

15 GTGGCCCCCG 3CCTTGGCGC GCTCCCTGAT GTTGTGCGCT TCGCCGGCCG GCTTACCGAG 1080 

AAGA.AT7GGG GCC"GGCCC TGAGCGGGCG GAGCAGCTCC GCCTCGCTGT TAGTGATTTC 1140 

CTCCGCAAGC TCACGAATGT AGCTCAGATG TGTGTGGATG TTGTTTCCCG TGTTTATGGG 1200 

20 

GTTTCCCCTG GACTCGT7CA TAACCTGATT GGCATGCTAC AGGCTGTTGC TGATGGCAAG 1260 

GCACATTTCA CTGAGTCAGT AAAACCAGTG CTCGA 1295 
SEQ ID NO. 2:

Arg Pro Val Pro Val Ala Ala Val Leu Pro Pro Cys Pro Glu Leu Glu 
15 10 15 

30 Gin Gly Leu Leu Tyr Leu Pro Gin Glu Leu Thr Thr Cys Asp Ser Val 

20 25 30 

Val Thr Phe Glu Leu Thr Asp He Val His Cys Arg Met Ala Ala Pro 
35 40 45 



Ser Gin Arg Lys Ala Val Leu Ser Thr Leu Val Gly Arg Tyr Gly Gly 
50 55 60 



Arg Thr Lys Leu Tyr Asn Ala Ser His Ser Asp Val Arg Asp Ser Leu 
40 65 70 75 80 

Ala Arg Phe He Pro Ala He Gly Pro Val Gin Val Thr Thr Cys Glu 
35 90 95 

45 Leu Tyr Glu Leu Val Glu Ala M et Val Glu Lys Gly Gin Asp Gly Ser 

IOC 105 no 

Ala Val Leu Glu Leu Asa Leu Cys Asn Arg Asp Val Ser Arg He Thr 
115 120 125 



Phe Phe Gin Lys Asc Cys Asn Lys Phe ~hr Thr Gly Glu Thr He Ala 
130 ' 135 140 



His Gly Lys Val Gly Gin Gly He Ser Ala Trp Ser Lys Thr Phe Cys 
55 145 150 155 160 
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45 



_eu ? H e j'v "2 -"e - t . e : _ ^sj^^Ala lie Leu Ala 

:55 ' ho " 175 

Leu _eu ?r c 3^ H< -3' " . - 3'. -'a ~'"e -sp Asd T hr V a 1 

5 " " ISO ' ' H5 190 

3 he Ser A:a A"i a ,'a' i -'a -'a -'a Se^ M e: Val Phe Glu Asn 

195 Zi: ' 2C5 

10 A ss ?he Se ^ Glu Dr .e As- A3- ~sr. -*e Se^ Leu Gly Leu 

210 215 220 

Glu Cys Ala :ie M e: 3"u GH Hs GH M et Pro Gin Trp Leu He Arg 
225 220 ' * 225 240 

15 

Leu Tyr His Leu He Arg 50^ ~ : a He _eu 31 n Ala Pro Lys Glu 
2AE 250 255 

Ser Leu Arg 31 y Phe ~rp Lys ,,i Mi s Ser Sly 3<u ? ro Gly Thr Leu 
20 ^ 260 255 270 

Leu Trp Asn Thr Va 1 T rp Asn M e: A:a Val He Thr His Cys Tyr Asp 
275 ISO 2S5 

25 Phe Arg Asp Phe Gin Val Ala Ala Phe ^ yS Gly Asd Asp Ser He Val 

290 295 300 

Leu Cys Ser Glu Tyr Arg Gin Ser ~rz Sly Ala Ala Val Leu He Ala 
305 ' 310 315 320 



Gly Cys Gly Leu Lys Leu Lys Val Aso -he Arg Pro lie Gly Leu Tyr 
325 330 335 



Ala Gly Val Val Val Ala Pre Sly Leu Gly Ala Leu Pro Asp Val Val 

35 340 345 350 

Arg Phe Ala Gly Arg Leu Thr 31u Lys Asn Trp Gly Pro Gly Pro Glu 
355 ' 350 ' 365 

40 Arg Ala Glu Gin Leu Arg Leu Ala Val Ser Asp Phe Leu Arg Lys Leu 

370 375 380 

Thr Asn Val Ala Gin Vet Cys Va' isc Val Val Ser Arg Val Tyr Gly 

385 390 395 400 



Val Ser p ro Gly _eu val * ; s -s" _eu He Gly M et Leu Gin Ala Val 
4C5 410 415 



Ala Asd Gly Lys Ala - - s P h e ~-r GH Se^ ,al Lys ?ro Val Leu 
50 420 425 430 

SEQ ID NO. 3 : 

Asd Leu Se^ Leu Leu 3'" Leu ?l ~e ~y Ji s "ro Ala ?ro Ser Ser Asn 
55 1 5 10 15 
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A 1 3 - h e Se r ~hr Cys Arg Ser Ser Pro^ro Val He Val Ser 

20 25 30 

His lj Asn . Gin *hr _eu Cys ~hr Ala Ala Trp Pro Pro Arg 
5 35 40 45 

Ala Se"- Arg -ro Zys Cys 3 ro ~> ■ s 5e^ Trp Ala Ala Thr Ala Val 
50 55 50 

10 Ala j'n Se" Ce^ ~hr M et Leu 2-3 Thr Leu Met Phe Ala Thr Leu Ser 

65 "0 75 80 

Pro Va 1 _eu Ze^ Arg *ro Leu - 7 a Pro T yr Arg Leu Gin Leu Val Asn 
35 90 95 

15 

Cys Thr Ser . "rp Arg Pro ~ro Ser Arg Arg Ala Arg Met Ala Pro 

:cc :c5 no 

Pro Ser Leu Ser Leu lie Phe Ala Thr Val Thr Cys Pro Gly Ser Pro 
20 115 120 125 

Ser Ser Arg Lys He Val Thr Ser Ser Pro Gin Val Arg Pro Leu Pro 
130 135 140 

25 Met Val Lys Trp Ala Arg Ala Ser Arg Pro Gly Ala Arg Pro Ser Ala 

145 150 155 160 

Pro Ser Leu Ala Leu Gly Ser Ala Leu Leu Arg Arg Leu Phe Trp Pro 
165 170 175 



30 



Cys Ser Leu Arg Val Cys Phe Thr Val Met Pro Leu Met Thr Pro Ser 
180 185 190 



Ser Arg Arg Leu Trp Pro Gin Gin Arg His Pro Trp Cys Leu Arg Met 
35 195 200 205 

Thr Phe Leu Ser Leu Thr Pro Pro Arg He Thr Phe Leu Trp Val 
210 215 220 

40 Ser Val Leu Leu Trp Arg Ser Val Gly Cys Arg Ser Gly Ser Ser Ala 

225 230 235 240 

Cys lie Thr Leu . Gly Leu Arg Gly Ser Cys Arg Pro Arg Arg Ser 
245 250 255 

45 

Leu Cys Glu Gly Phe Sly Arg Asn Thr Pro Val Ser Pro Ala Leu Phe 
250 265 270 

Tyr Gly He Leu Se^ Gly He *Tro P^o Leu Leu Pro Thr Val Met Thr 
50 275 250 285 

Ser Ala He Phe Arg T rc Leu P^o Leu Lys Val Met He Arg . Cys 

290 295 300 

55 Phe Ala Val Ser He va' ^ r g /a' 31 n Glu Leu Leu Ser . Ser Pro 

305 512 315 320 
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* 



la 7a* Ala 



e ^e- Ala Arg Ser Va 1 Cys Met 
330 335 



Gin 7a 1 Leu Tr 



A' a _eu ^3 Arg 5e° Leu Met Leu Cys 



10 



Ala Se^ =ro A'a jly -e- Arg Arg ;'e Gly A^a Leu Ala Leu Ser 
355 360 365 



31 y A r g Ser Se r 
370 



- ■ e Se 1 " Ser Ala Ser Ser 



Arg Me: . Leu -rg I. s ,'al M e: .e. -he p-c Va; ?he Met Gly 

335 39: 395 400 

Phe Pro Leu Aso Ee" : '"e lie ~' n r . _eu Ala Cys Tyr Arq Leu Leu 

AC5 Ji: ' 415 



20 



Leu Met Ala Arg H^s I-e 5e^ Leu 5 e ^ Sin . Asn Gin Cys Ser 
420 -25 430 



SEQ ID NO. 4 : 



25 



Thr Cys Pro Cys Cys Se^ Cys Ser Thr "hr Leu Pro Arg Ala Arg Thr 
1 5 10 15 



30 



Gly Pro Ser Leu Pro Ala Pro Gly Ala J is His Leu . . Cys Arg 

20 25 30 

Asn He . lie Asn Arg His Cys Ala Leu Pro His Gly Arg Pro Glu 

35 40 45 



35 



Pro Ala Gin Gly Arg Ala Va 1 His Thr Arg Gly Pro Leu Arg Arg Ser 
50 55 60 



His Lys Ala Leu Gin Cys Phe Pro Leu . Cys Ser Arg Leu Ser Arg 
65 70 75 80 



40 



Pro Phe Tyr Pro Gly His Trp Pro Arg ~hr Gly Tyr Asn Leu . lie 
85 90 95 



45 



Val Arg Ala Ser Gly Gly His Gly - r g Glu Gly Pro Gly Trp Leu Arg 
100 105 110 



Arg Pro . Ala . Se^ _eu Glr 



Arg Vai Gin Asp His Leu 
125 



50 



Leu Pro Glu Arg Leu . Glr, Val His H i s Arg . Asp His Cys Pro 
130 135 140 



Trp . Ser Gly Pro Gly His Leu Gly Leu Glu Gin Asp Leu Leu Arg 
1A5 150 155 160 



Pro Leu r ro Pro Leu Val 2 ^o Arg ~.r . Glu Gly Tyr Ser Gly Pro 
155 170 175 
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25 



40 



Ala P'-o Se^ j iy lys .3' _eu . : y s Leu . . His Arg Leu 

ISO 135 190 

Leu G'y Gly Cys 3', 5e- _,s Yy Ye -is Gly 7a 1 . Gl u . 

135 ::: 205 

Leu -he . . _eu -'S Yj . Leu Phe Ser Gly Ser Arg 

213 21: 220 

Val Cys Tyr Tyr Yy 3'y ;al r rc Aso Ala Ala Val Ala His Pro Pro 
225 220 235 240 



Val Ser ?rc ~yr Lys Jz 1 Cys Val Asd Leu Ala Gly Pro Glu Gly Val 
15 2^5 250 255 

Ser Ala Arg Val Leu Yj Glu ~hr lcj Arg . Ala Arg His Ser Ser 
250 255 270 

20 Met Glu Tyr Cys Leu Glu T yr Gly Arg Tyr Tyr Pro Leu Leu . Leu 

275 230 285 

Pro Arg Phe Ser Gly Gly Cys Leu . Arg . . Phe Asp Ser Ala 
290 295 300 



Leu Gin . Val Ser Ser Glu Ser Arg Ser Cys Cys Pro Asp Arg Arg 
305 310 315 320 



Leu Trp Leu Glu Val Glu Gly Arg Phe Pro Pro Asp Arg Phe Val Cys 

30 325 330 335 

Arg Cys Cys Gly Gly Pro Arg Pro Trp Arg Ala Pro . Cys Cys Ala 

340 345 350 

35 Leu Arg Arg Pro Ala Tyr Arg Glu Glu Leu Gly Pro Trp Pro . Ala 
355 360 365 



Gly Gly Ala Ala Pro Pro Arg Cys . . Phe Pro Pro Gin Ala His 

370 375 380 

Glu Cys Ser Ser Asp Val Cys Gly Cys Cys Phe Pro Cys Leu Trp Gly 

385 390 395 400 



Phe Pro Trp Thr Arg Ser . Pro Asp Trp His Ala Thr Gly Cys Cys 
45 ^05 410 415 

. Trp Gin Gly T hr Phe yi s . Val Ser Lys Thr Ser Ala Arg 
420 425 430 

The complementary strand, referred to here
as the "reverse sequence," is set forth below in the
same manner as the forward sequence set forth above.
Several open reading frames, shorter than the long
open reading frame found in the forward sequence, can
be seen in this reverse sequence. Because of the
relative brevity of the open reading frames in the
reverse direction, they are probably not expressed.
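
The relationship between the two strands, and the comparison of open reading frame lengths, can be sketched as follows. This is an illustration only: it assumes the standard stop codons (TAA, TAG, TGA) and measures an open reading frame simply as a stop-free run of codons. The function names are hypothetical and do not come from this specification.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_open_run(seq, frame):
    """Length, in codons, of the longest stop-free stretch in one frame.

    `frame` is the 0-based offset (0, 1, or 2) of the first codon.
    """
    seq = seq.upper()
    longest = current = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            current = 0          # a stop codon ends the current open run
        else:
            current += 1
            longest = max(longest, current)
    return longest


if __name__ == "__main__":
    forward_start = "AGACCTGTCC"  # first 10 nt of the forward sequence
    print(reverse_complement(forward_start))
    for frame in range(3):
        print(frame + 1, longest_open_run("ATGAAATAGCCC", frame))
```

Applying `longest_open_run` to each frame of a strand and of its reverse complement gives a quick way to compare how much uninterrupted coding capacity each direction has.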

The following gene sequence has SEQ ID NO. 5.

Reverse Sequence



ATGCIAA7CA GG"-1~~: 3-G*::-333 lAAACIZC-^ AAACACGGGA AACAACATCC 120 

ACACACA7..7 jA.jC.A;_~. _G j-GG"-^ jG~G.:AAA"" CAC7AACAGC GAGGCGGAGC 180 

TGCTCCGCwC GC . v-AGo-ju C -ucj A A ^"".CTCGG 'AAGCCGGCC GGCGAAGCGC 240 

ACAACA7CAG GGAGCGCGCC -AGGCCGGGG GCCACCACAA CACCTGCATA CAAACCGATC 300 

GGGCGGAAAT CTACCTCAA CT'CAAGCCA CAGCCGGCGA TCAGGACAGC AGCTCCTGGA 360 

CTCTGACGAT ACTCACTGCA AAGCAC7A7C GAATCATCAC CTTTAAAGGC AGCCACCTGA 420 

AAATCGCGGA AGTCATAACA G7GGG7AA7A ACGGCCATAT TCCAGACAGT ATTCCATAGA 480 

AGAGTGCCGG GCTCACCGGA GTGT'TCT'C CAAA-CCC7C GCAGAGACTC CTTCGGGGCC 540 

TGCAAGATCC ACGCAGACC! 7A7AAGG7GA 7ACAGGCGGA TGAGCCACTG CGGCATCCCA 600 

CACTCCTCCA TAATAGCACA CTCTAGACCC AGAGAAAAGT TATTCTGGGT GGAGTCAAAC 660 

TCAGAAAAGT CATTCTCAAA CACCATGGAT GCC'TTGCTG CGGCCACAGC CGCCGAGAAG 720 

ACGGTGTCAT CAAAGGCA7C ACCGTAAJAC ACACCCTGAG GGAGCAGGGC CAGAATAGCC 780 

77C7CAA7AG CGCGGAACCA AGGGCCAAAG AGGGCGCAGA AGGTCTTGCT CCAGGCCGAG 840 

ATGCCCTGGC CCACTTTACC ATGGGCAATG GTCTCACCTG TGGTGAACTT GTTACAATCT 900 

TTCTGGAAGA AGGTGATCCT GGACACGTCA CGGTTGCAAA GATCAAGCTC AAGGACGGCG 960 

GAGCCATCCT GGCCCTTCTC GACCATGGCC 'CCACTAGCT CGTACAATTC ACAAGTTGTA 1020 

ACCTGTACGG GGCCAATGGC GGGGA^AAAA CGGGCGAGAG AGTCGCGAAC ATCAGAGTGG 1080 

GAAGCAT'oT AGAGC"" r G7 3CGACGGCCG 7AGCGGCC2A CGAG7G7GGA CAGCACGGCC 1140 

TTGCGCTGGC TCGGGGCGGC ZATGCGGCAG "GCACAATGT CTGTTAATTC AAATGTTACG 1200 

ACAC i ATCAC ^GGTGu'jAb C7^ . juGuC ^GuTAGAGAA GGCCC7G77C GAGCTCGGGG 1260 

CAGb'jTuu ; A GAACAG^'G^ aa_A u ^-ACA GG7C~ 1295 

Identity of this sequence with sequences in
etiologic agents has been confirmed by locating a
corresponding sequence in a viral strain isolated in
Burma. The Burmese isolate contains the following
sequence of nucleotides (one strand and open reading
frames shown). The following gene sequence has SEQ ID
NO. 6; the protein sequence corresponding to ORF1 has
SEQ ID NO. 7; ORF2 has SEQ ID NO. 8; and ORF3 has SEQ ID
NO. 9.

SEQUENCE OF HEV (BURMA STRAIN)
|-ORF1-->

M E A H Q F : K A P G 
AGGCAGACCAC-"A'GT3G'I3A'3CCAT3GAGGCCCA r CAGTTTATTAAGGCTCCTGGC 

I T T A ! E : A A L A A A N S A L A N A 

is atcactactgc t a t ^gag;aggc"gc:'agcagcggccaactctgccctggcgaatgct 120 

VVVS^rLSHQgiElLINLMQ 

gtggtagttaggc: t tt"c t ctc t cac:agcagattgagatcctcattaacctaatgcaa 

20 prqlvfrpevfwnhpiqrvi 

cctcgccagcttgttttccgccccgaggttttctggaatcatcccatccagcgtgtcatc 240 



25 



HNELELYCRARSGRCLEIGA 
CATAACGAGCTGGAGCTTTACTGCCGCGCCCGCTCCGGCCGCTGTCTTGAAATTGGCGCC 

HPRSINDNPNVVHRCFLRPV 
CATCCCCGCTCAATAAATGATAATCCTAATGTGGTCCACCGCTGCTTCCTCCGCCCTGTT 360 

GRDVQRWYTAPTRGPAANCR 
30 GGGCGTGATGTTCAGCGCTGGTATACTGCTCCCACTCGCGGGCCGGCTGCTAATTGCCGG 

RSALRGLPAADRTYCLDGrS 
CGTTCCGCGCTGCGCGGGCTTCCCGCTGCTGACCGCACTTACTGCCTCGACGGGTTTTCT 480 

35 GCNFPAETGIALYSLHOMSP 
GGCTGTAACTTTCCCGCCGAGACTGGCATCGCCCTCTACTCCCTTCATGATATGTCACCA 

SOVAEAMFRHGMTRLYAALH 
TCTGATGTCGCCGAGGCCATGTTCCGCCATGGTATGACGCGGCTCTATGCCGCCCTCCAT 600 

LPPEVLLPPGTYRTASYLLI 
CTTCCGCCTGAGGTCCTGCTGCCCCCTGGCACATATCGCACCGCATCGTATTTGCTAATT 



40 



H 0 G R R V V V 7 v E G D T S A G Y N H 
45 CATGACGGTAGGC3CGTTGTGGTGACGTAT3AGGGTGATACTAGTGCTGGTTACAACCAC 720 

DVSNLRSWIR7TKVTGDHPL 
GATGTCTCCAAC'TGCGCTCCTGGATTAGAACCACCAAGGTTACCGGAGACCATCCCCTC 

50 VI ERVRAIGCHFVLLLTAAP 

GTTATCGAGCGGG*TAGGGCCAT'GGC"G:CACTTTGTTCTCT t GCTCACGGCAGCCCCG 840 



55 



epsp^oyvs'srstevyvrs 
gagccatcacc'a'gc:"a'gt'::'^ac:i::ggt: t accgaggtctatgtccgatcg 
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15 



30 



" G 3 G 3 * : 5 _ - = ~ : C — S ' < 5 

atc":ggc::3gg t ggi-: — ::':-'gc^c:ac t aagtcgacc 960 

T~C~A73 2^ j'C „ "G 1 - A j „ j "7. G j. -ACC77GGA7 

o q a f : c 5 r - M ' ' . ; 3 : 5 < < v 

GACCAAGCC'7 , . ^"v." - - j GA^C.A^-. ^ j_ j\jCA77AjC7ACAAGG7C 1080 

7 V i r : 7 A n E G n N A : E D A L T A 
AC7G77oG7AfC^. . G > jGC ■ m . m : jA^jo-l jGAA ; a^C , ^7GAGGAC jCCG i CACAGC7 

v r 7 a a < _ - : : h q r • _ r - g a i 

G77A7CAC7GCC jv- I^a^ . . o^AG^Au^ jo : A ; C7CCjCACCCAGGC7A7A 1200 

SKGMRS.EREHA^r'ITRLY 
7CCAAGGGGA7GC G"i"CG7C ■ jGAALoo-jA^^A . j^uCAa:AAG777A7AACACGCC7C7AC 



S W L - E < S G - D ^ I 3 G R Q L E F Y 
20 AGC "C7AC 1320 

AQCRR* lSAG"HL?PRVLVF 
GCCCAG i olAGuu jC : Go^ ■ ^ . ^o^jo. ■ ■ 'lAT 1 ^. . Gh.7CCACGGG7G77GG7TT7T 

25 0ESAPCHC57AIRKALSKFC 

GACGAG7CGGCCCCC7GCCA' rT GTAGGACCGCGATCCG7AAGGCGC7C7CAAAG7777GC 1440 

CFMKWLGQEC'C'LQPAEGA 
7GC7 i CA7GAAG ■ oGC ' G : j . C.-'ouAb i „ i G~ . : : ■ CAGCC7GCAGAAGGCGCC 



VG0QGHDNEAYEGS0VDPAE 
GTCGGCGACCAGGG7CATGA7AA7GAAGCC7A7GAGGGG7CCGATGTTGACCCTGCTGAG 1560 



S A I S D I S G S Y V V ? G 7 A L Q P L 
35 7CCGCCA77AG7GACA7A7C7GGG7CCTA7G7CG'CCC7GGCAC7GCCC7CCAACCGC7C 



YQALDLPAEIVARAGRLTAT 
TACCAGGCCC7CGA7C7CCCCGC7GAGA77G7GGC7CGCGCGGGCCGGC7GACCGCCACA 1680 

40 VKVSQVDGRIDCETLLGNKT 
GTAAAGGTCTCCCAGGTCGA7GGGCGGATCGATTGCGAGACCC7TCTTGGTAACAAAACC 

FRTSFVDGAVLETNGPERHN 
777CGCACG7CG77CG77GACGGGGCGG7C7TAGAGACCAA7GGCCCAGAGCGCCACAA7 1800 

45 LSFDAS05TMAAGPFSLTYA 
CTCTCCTTCGATGCCAG7CAGAGCAC*TA7GGCCGC'GGCCC77TCAGTCTCACCTATGCC 

A S A A G L E V R f A A G L 0 H R A V 

-0 GCC7C7GCAGw . Gcji .jj^uj ■ jCji ■ G ■ ul ■ j*_ i ,GvjoC""7GACCA7CGGGCGG77 1920 

FAPG 'v'S^RSA-GE / 7 A r C S A 
777GCCCCCGG7G777CAC7ZCGvj7CAGwICGIGGCGAGG"ACCGCC77C7GC7C7GCC 

55 LYR-NREA^RyS-'GNLWFH 

C7A7ACAGG77TAACCG AC77A7GG77CCA7 2040 

P E G . : G . - A 3 f s p 3 h V W E S A 
CC7GAGGGAC7CAT'GGCC"CCGCCCZjT""CGCC 3GGGCA7G777GGGAG7CGGC7 

50 



N 3 r I 3 E S ~ 

AA i v_ ^ A , > w , 3 ' 3 j ^ j A j j 3 - > 

s s 3 a 3 = : _ 

i u i ^ o i l.^AGC33jj_~3A^~~. 

A T P r i i ; d 
S A ' A _ A E ~ 

H Q 7 A P i S R 
C AC CAGA „ 3uC 1 2 jGC A C C 3 C 3 - 



* " R T "< S E 7 0 A V 

"acac::g~ac~~gg~:3gagg77ga7gccgtc 2ieo 

- ^ S E 3 s : P S R A 

- j . . . jAGv-2 ~-7A~ACC7AG7AGGGCC 

3 ^^A30 3 5?PP 

:::::::ctgcaccggacc:t'ccccccctccc 2230 
3 g a 7 a g a p a i t 

- ^jv. j;. . ACCGCCGGGGCCCCGGCCATAACT 

- r * V ? D G S < V F 

;C ~3"3A CC7ACCCGGA~GGC7C7AAGG7A77C 24C0 



A G 5 L ~ E 5 ~ C * * l V N A S N V D H 
GCCGGC7CGC7G7~:GAG733ACA7G3AC37GGC T CG77AACGCG7C7AA7G77GACCAC 



RPGGGlCHA 



7 3 R Y P A S F D A 
77ACCAAAGG7ACCCCGCC7CC7TTGATGCT 2520 



ASFVMRDGAAAYTLTPRPII 
GCC7C7777G7GA T GCGCGACGGCGCGGCCGCG7ACACAC7AACCCCCCGGCCAATAATT 



H A V A D D 
CACGCTG7CGCC 



' 3 L E H N P K R L E A A Y 
'A7AGG77GGAACA7AACCCAAAGAGGCTTGAGGCTGCTTAT 2640 



RE7CSRLG7AAYPLLGTGIY 
CGGGAAACTTGCTCCCGCCTCGGCACCGCTGCATACCCGCTCCTCGGGACCGGCATATAC 

QVPIGPSFDAWERNHRPGDE 
CAGGTGCCGATCGGCCCCAG7777GACGCC7GGGAGCGGAACCACCGCCCCGGGGATGAG 2760 

LYLPELAARWFEANRPTRPT 
TTGTACCTTCCTGAGCTTGCTGCCAGATGGTTTGAGGCCAATAGGCCGACCCGCCCGACT 

LTITEDVARTANLAIELDSA 
CTCAC7A7AACTGAGGATG77GCACGGACAGCGAATCTGGCCATCGAGCTTGACTCAGCC 2880 

TOVGRACAGCRVTPGVVQYQ 
ACAGATGTCGGCCGGGCCTGTGCCGGCTGTCGGGTCACCCCCGGCGTTGTTCAGTACCAG 

FTAGVPGSGKSRSITQADVD 
TTTACTGCAGG7G7GCC7GGA7CCGGCAAGTCCCGCTCTATCACCCAAGCCGATGTGGAC 3000 

VVVVPTRELRNAWRRRGFAA 
GTTGTCGTGGTCCCGACGCG7GAGTTGCGTAATGCCTGGCGCCGTCGCGGCTTTGCTGCT 



FTPH7A4RV* 

TTTACCCCGCATAC'GCCGCCAGAG^ZAC: 

PSLPPhLll^ 



QGRRVVIDEA 

zaggggcgccgggttgtcattgatgaggct 3120 

hmqraatvhl 
:acatgcagcgggccgccaccgtccacctt 



L G 0 P n Q ; s a ; 3 
CTTGGCGACCCGAACC AGA7CC CAGC 3A~C 3AC 



ISPOLGP'S 
A7CAGGC 



ehaglvpa 
7gagcacgc7gggc7cgtccccgcc 3240 

* h v 7 h r w p a d 

3g'ggca7g77accca7cgc7ggcctgcggat 



L . 3 Q ' 5 R 7 I R 

gtatgcgagc*:-'i:3 t ig t ::.a"a::::-''^."::;ia,::ac';3c:ggg^tctccgt 3360 
s l z « 3 e ~ a / ^ : < _ ; " 7 Q A a < 

'CG 1 . Gi 1 j'ocj j.^o- v, ■ ov. w j -j'jj-.-jf-A^. AG j . "„Al ECAGGCGGC CAAG 
3 A N = 3 5 v ■ "T v - E A 0 3 A T / T E T 

c::g:caac::: 33 :':a3'^:33': 1-33:3 3:3:a33g:gcac:tacacggagacc 34so 

acta7tat'gc:-:-g:-g-'3:::g3gg::"a'*::g':g-:':3,33C"wtgccatt 

v a l t r -< : e < : ; : : : a = g l l r 
gttgctctgacg:gc:a:a.:'gagaag*g:g*:-":a"3a:gca::aggcctgcttcgc 3600 

e v g : s :■ a : v n n - - . a g g e 1 g 

G AGG TG GG Z A ~3 ' G Z 3 A ' 3 C - A " 3 3 a - ~ - ; r E C ~ 3 3 G " 3 G ~G G C GA A. ATTGGT 



H Q R P S ; : = R G l i ? 3 a 'J ; D T L A 
20 CACCAGCGCCCA'C-G a" EECG3~333 : -CEC T GACGCEAATGTTGACACCCTGGCT 3720 

AFPPSC3ISA--QLAEELGH 
GCCTTCCCGCCG i C ; i oCE-'oA . i AG7G^v"^CA7CAG"GGCTGAGGAGCTTGGCCAC 

25 RPVPVAA7LPPC?ELEQGLL 

AGACCTGTCCCTG77GCAGC'GTTC:AC:ACCC t GCECCGAGC^CGAACAGGGCCTTCTC 3840 



YLPQELTTCDSVVTFELTOI 
TACCTGCCCCAGGAGC'CACCAC:TGTGA7AGTG t CGTAACATTTGAATTAACAGACATT 

vhcrmaapsqr<avlstlvg 

GTGCACTGCCGCA7GGCCGCCCCGAGCCAGCGCAAGGCCG t GCTGTCCACACTCGTGGGC 3960 



RYGGRTKLYNASHSOVRDSL 

35 cgctacggcggtcgcacaaagctctacaatgcttcccac^ctgatgttcgcgactctctc 

A R F r p A : G P V Q V T 7 C E L Y E L 
GCCCGTTTTATCCCGGCCA'7GGCCCCGTACAGG' t ACAAC77GTGAATTGTACGAGCTA 4080 

40 V E A M V E k G Q 0 G S AVLELDLC 

GTGGAGGCCATGGTCGAGAAGGGCCAGGA T GGCTC3GCCGTCC7TGAGCTTGATCTTTGC 



NRDVS3 I 7 F - Q < ECNKFTTG 

aaccgtgacgtgtccagga^eac:t7c tt ccagaaaga7tgtaacaagttcaccacaggt 4200 

E T I A H G *. V G Q 3 I S A W S < T F C 
GAGACCATTGCCCATGG t aaaGTGGGCEAGGGCA"C'3GGCC t GGAGCAAGACCTTCTGC 



alfgpw"ra;e<a;^allpq 
50 gccctctt7ggcc:*"gg7" r 3 3gcgc'a7^gagaaggc t a7'c'ggccctgctccctcag 4320 

GVFY30 A FOOT 7 FSAAVAAA 
GG TGTGT7TT A. C GG 'G A T GCC"T'7GA"GA CAC EG 73 77CTCGGCGGCTGTGGCCGCAGC A 

55 K A S M V ? E N 0 " 5 E - D 5 7 Q N N F 

AAGGCATCCA7GG"G7"GAGAA'GAC""E"3AG777GAC V CCACCCAGAATAACTTT 4440 



S L G L E C A : M E E 3 G M p Q W L I R 
i C . CTGGu i L . *u.-u j ■ . -» A:jj«ojA.j~j"GGGA~jCCGCAGTGGCTCATCCGC 



60 
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LYHLIRSAWILQAPKESLRG 
C"GTATCAC:TTATAAGGTC T GCGTGGATCTTGCAGGCCCCGAAGGAGTCTCTGCGAGGG 4560 

FWKKHSGEPGTLLWNTVWNM 
5 TTTTGGAAGAAACACTCCGGTGAGCCCGGCACTCTTCTATGGAATACTGTCTGGAATATG 

AVITHCYDFROFQVAAFKGD 
GC:3TTATTAC:CACTGTTATGACTTCCGCGATTTTCAGGTGGCTGCCTTTAAAGGTGAT 4680 

10 3 5 I V L C S E Y R Q S P G A A V L I A 

GA"CGATAG'GCTTTGCAGTGAGTATCGTCAGAGTCCAGGAGCTGCTGTCCTGATCGCC 






GCGLKLKVDFRPIGLYAGVV 
GGCTGTGGCTTGAAGTTGAAGGTAGATTTCCGCCCGATCGGTTTGTATGCAGGTGTTGTG 4800 

VAPGLGALPDVVRFAGRLTE 
GTGGCCCCCGGCCTTGGCGCGCTCCCTGATGTTGTGCGCTTCGCCGGCCGGCTTACCGAG 



KNWGPGPERAEQLRLAVSDF 
20 AAGAATTGGGGCCCTGGCCCTGAGCGGGCGGAGCAGCTCCGCCTCGCTGTTAGTGATTTC 4920 

LRKLTNVAQMCVDVVSRVYG 
CTCCGCAAGCTCACGAATGTAGCTCAGATGTGTGTGGATGTTGTTTCCCGTGTTTATGGG 

25 VSPGLVHNLIGMLQAVAOGK 

GTTTCCCCTGGACTCGTTCATAACCTGATTGGCATGCTACAGGCTGTTGCTGATGGCAAG 5040 



AHFTESVKPVLlOLTNSILCR 
30 GCACATTTCACTGAGTCAGTAAAACCAGTGCTCGACTTGACAAATTCAATCTTGTGTCGG 

|-ORF3-->

MNNMSFAAPMGSRPCALG 

M R P R P 

V E Z |-ORF2-->

GTGGAATGAATAACATGTCTTTTGCTGCGCCCATGGGTTCGCGACCATGCGCCCTCGGCC 5160 

LFCCCSSCFCLCCPRHRPVS 
ILLLLLMFLPMLPAPPPGQP 

TATTTTGTTGCTGCTCCTCATGTTTTTGCCTATGCTGCCCGCGCCACCGCCCGGTCAGCC 

RLAAVVGGAAAVPAVVSGVT 
SGRRRGRRSGGSGGGFWGOR 

GTCTGGCCGCCGTCGTGGGCGGCGCAGCGGCGGTTCCGGCGGTGGTTTCTGGGGTGACCG 5280 

GLILSPSQSPIFrqPTPSPP 
VDSQPFAIPYIHPTNPFAPD 

GGTTGATTCTCAGCCCTTCGCAATCCCCTATATTCATCCAACCAACCCCTTCGCCCCCGA 

MSPLRPGLDLVFANPPDHSA 
VTAAAGAGPRVRQPARPLGS 

TGTCACCGCTGCGGCCGGGGCTGGACCTCGTGTTCGCCAACCCGCCCGACCACTCGGCTC 5400 






G J ~ - ; i a : - _ - - / / j l ? Q 



- «- j ..j ~jAC~iAC2ACAGl 



L o ■* ^ ■< 

G a a 3 l = = ? v p d v 

~GG3GCCjGGICGCTAA'"GIj.:"GI" , "-_.:j:IIGA"a " agzGCGZCAG"CCTGATGT 5520 
D 3 R G A ; . : = ; - _ 5 - s ? L T S 

CGni, . ilC ji- jov^ w - - j — jij^-j - "*-'_.. . . n v„ ■ ^ i ^ C C i i A C C i C 
15 S 7 A 7 G * N . . _ ' A A 3 . 5 ? L L P 



ttccg t ggccac:gg:a:"aa::'gg":"'^gc:g:c:: t ctagtccgcttttacc 5540 

L 0 0 G T n ' - I M A T E A 5 N Y A Q Y 

ccttcaggacggcac:aa"^c::a'- t aa'gg::acggaagc t tctaattatgcccagta 

R V A P A 7 I R - ■ h 3 . V P N A V G G Y 

25 ccgggttgcccgtgccacaatc:g t ^ac:gc:cgc^ggtcc:caatgctgtcggcggtta 5760 

AISIS r W?Q77'7?7SVDMN 



CGCCATCTCCATCTCATTCTGGCCACAGACCACCACCACCCCGACGTCCGTTGATATGAA 
S I T S 7 D V R I L'/QPGIASELV 
TTCAATAACCTCGACGGATGTTCGTATTTTAGTCCAGCCCGGCATAGCCTCTGAGCTTGT 5880 
35 I P S E R L H i R H Q G « R S 7 £ T S G 

GATCCCAAGTGAGCGCCTACAC*ATCGTAACCAAGGCTGGCGCTC CG7CGAGACC7C7GG 
V A E E E A T S G L 7 M L C I H G S L V 

40 

GGTGGCTGAGGAGGAGGCTACC'CTGGTCTTGTTATGCTTTGCATACATGGCTCACTCGT 5000 
NSYTNTPY7GALGLL0FALE 

45 AAATTCCTATACTAATACACCC^ATACCGGTGCCCTCGGGCTGTTGGACTTTGCCCTTGA 

LEFRNLT 3 GN r N7PVSRYSS 

GCT7GAG7TTCG'v,AACC"A L G 1 _ZG3^AACACCAA7AC 3CGGG7Z7CCCG7TATTCCAG 6 120 

50 

TAPHPL^^GADGTAELTTTA 
CAC7GC7CGCCACCcl ^"v-u ^l^-j jlGGACGGGAC7GCCGAGC7CACCACCACGGC 
55 A7RF U <D„' "'S'NG/GEIG 

iu^iACCloC' ! i ^ > y.^-jj." 1 .^ „ . - ■ ■« . . - j , ^ , mm ; j'j i j ; V.3G7GAGA7CGG 6240 







R 




r 



L L G G L P T 






CCGC3GGATAGCC:TCAC:: ,r 3'*:Ai::'-GC*3ACACTCTGC ,r TGGCGGCCTGCCGAC 
ELI 5SAG3j_ r '3RPVVSAN 



GE?TV<L''SVENAQQDKGI 
TGGCGAGCCGACTGT'AAGTTGTiTACA'CTGTAGAGAATGCTCAGCAGGATAAGGGTAT 

A I 2 - D : 0 L 3 E 5 3 V v I Q 0 Y D N 
TGC A A TC C C GC A ToA C A T'GA C CTGGG AG A A 'C'CG "3 TGGTTATTCAGGATTATGATAA 6480 

ccaacatgaacaagat:ggc:gacgc:"ctc:agc:c:atcgcgccctt7ctctgtcct 

R A N D V L * l S L 7 A A E Y D Q S T Y 

TCGAGCTAATGATGTGCTTTGGCTCTCTCTCACCGCTGCCGAGTATGACCAGTCCACTTA 6600 
GSSTGPVYVSDS7TL. VNVAT 

TGGCTCTTCGACTGGCCCAGTTTATGTTTCTGACTCTG7GACCTTGG7TAATGTTGCGAC 

GAQAVARSLDW7KVTLDGRP 

CGGCGCGCAGGCCG77GCCCGG7CGC7CGA77GGACCAAGG7CACAC77GACGG7CGCCC 6720 
LS7IQQYSK7FFVLPLRGKL 

CC7C7CCACCA7CCAGCAG7AC7CGAAGACC77C777G7CC7GCCGC7CCGCGG7AAGC7 

SFWEAG77KAGYPYNYNTTA 

CTC777C7GGGAGGCAGGCACAAC7AAAGCCGGG7ACCC77A7AA7TA7AACACCACTGC 6840 
SDQLLVENAAGHRVA IS7Y7 

7AGCGACCAAC7GC77G7CGAGAA7GCCGCCGGGCACCGGG7CGC7A777CCAC77ACAC 
TSLGAGPVSISAVAVLAPHS 

CAC7AGCC7GGG7GC7GG7CCCG7C7CCA777C7GCGG77GCCG77TTAGCCCCCCACTC 6960 
ALALLED7LDY2ARAH7FDD 

TGCGC7AGCA77GC77GAGGA7ACC77GGAC7ACCC7GCCCGCGCCCA7ACTTTTGATGA 
f r CPECRPLGLQGCAFQS7VA 

777C7GCCCAGAG7GCCGCC::C7-3GCC"CAGGGC7GCGC777CCAG7C7AC7G7CGC 7030 
ELQRL<MKVG<7RELZ 

7GAGC" T CAGCGCC7"AAGA'3AA33'3G3*A J ::AC T :GGGAG77G7AGrr7A777GC77 




G7CCCG77G7C7CAGCCAA 6360 




Total number of bases in this sequence as 
presented is 7 135. The poly-A tail present in the 
cloned sequence has been omitted. 

The ability of the methods described herein to 
isolate and identify genetic material from other NANB 
hepatitis strains has been confirmed by identifying 
genetic material from an isolate obtained in Mexico. 
The sequence of this isolate was about 75% identical 
to the ET1.1 sequence set forth in SEQ ID NO. 1 above. 
The sequence was identified by hybridization using the 
conditions set forth in Section II.B below. 

In this different approach to isolation of the 
virus, cDNA libraries were made directly from a 
semi-purified human stool specimen collected from an 
outbreak of ET-NANB in Telixtac. The recovery of 
cDNA and the construction of representative libraries 
were assured by the application of sequence-independent 
single primer amplification (SISPA). A cDNA library 
constructed in lambda gt11 from such an amplified cDNA 
population was screened with a serum considered to 
have "high" titer anti-HEV antibodies as assayed by 
direct immunofluorescence on liver sections from 
infected cynos. Two cDNA clones, denoted 406.3-2 and 
406.4-2, were identified by this approach from a total 
of 60,000 screened. The sequence of these clones was 
subsequently localized to the 3' half of the viral 
genome by homology comparison to the HEV (Burma) 
sequence obtained from clones isolated by 
hybridization screening of libraries with the 
original ET1.1 clone. 

These isolated cDNA epitopes, when used as 
hybridization probes on Northern blots of RNA 
extracted from infected cyno liver, gave a somewhat 
different result when compared to the Northern blots 
obtained with the ET1.1 probe. In addition to the 
single 7.5 Kb transcript seen using ET1.1, two 
additional transcripts of 3.7 and 2.0 Kb were 
identified using either of these epitopes as 
hybridization probes. These polyadenylated 
transcripts were identified using the extreme 3' end 
epitope clone (406.3-2) as probe and therefore 
established these transcripts as co-terminal with the 
3' end of the genome (see below). One of the epitope 
clones (406.4-2) was subsequently shown to react in a 
specific fashion with antisera collected from 5 
different geographic epidemics (Somalia, Burma, 
Mexico, Tashkent and Pakistan). The 406.3-2 clone 
reacted with sera from 4 out of these same 5 epidemics 
(Yarbough et al., 1990). Both clones reacted with 
only post-inoculation antisera from infected cynos. 
The latter experiment confirmed that seroconversion in 
experimentally infected cynos was related to the 
isolated exogenous cloned sequence. 

A composite cDNA sequence (obtained from several 
clones of the Mexican strain) is set forth below. 

Composite Mexico strain sequence (SEQ ID NO. 10): 





GCCATGGAGG 


CCCACCAGTT 


CA77AAGGC7 


CC7GGCATCA 


CTACTGCTAT 


TGAGCAAGCA 


60 


25 


GCTCTAGCAG 


CGGCCAACTC 


CGCCC77GCG 


AATGC7G7GG 


TGG7CCGGCC 


TTTCCTTTCC 


120 




CATCAGCAGG 


TTGAGATCCT 


7ATAAA7C7C 


A7GCAACC7C 


GGCAGCTGG7 


GT7TCGTCCT 


180 




GAGGTTTTTT 


GGAATCACCC 


GA77CAACG7 


G77A7ACATA 


ATGAGCTTGA 


GCAGTATTGC 


240 


30 


















CGTGCTCGCT 


CGGGTCGCTG 


CC77GAGA77 


GGAGCCCACC 


CACGCTCCAT 


TAA7GATAAT 


300 




CCTAATGTCC 


TCCATCGC'G 


CTT7CTCCAC 


CCCG7CGGCC 


GGGA7GT7CA 


GCGCTGGTAC 


360 


35 


ACAGCCCCGA 


CTAGuuGmC l 


; GCoGlGAAC 


i G7CGCCGC7 


CGGCACTTCG 


7GG7C7GCCA 


420 




CCAGCCGACu 


OL.-*.L ! ' Al ; J 






GCCG7TTTGC 


CGCCGAGACT 


480 




GGTGTGGCTC 


T C T A7TC r C 7 


CCATGAC7T3 




A7G77GCCGA 


GGCGATGGCT 


540 


40 


















CGCCACGGCA 


-rr * r ^ ^ " ~" 


77A7GC ~GC" 


"TCCACTTGC 


C7CCAGAGG7 


GCTCCTGCCT 


600 








** ' u „ - l . a 


■ a A ■ v.\..".Cj 


A7GG7AAGCG 


CGCGGTTGTC 


660 


45 


ACT7A7GAGG 


■J I 'J >+■ \- " l ~ ■ - u 






7 T GCCACCC7 


CCGCACATGG 


720 
7CAGGACAA CAAoG j joG*jAA-ZA^ .C'oGTGA TCGAGCGGGT GCGGGGTATT 780 

ggctgtcac" "G"g:'3" :a-:::~i:i :::::"agc ::~c -v.wjAT GCCCTACGTT 840 

cc"ac::gc 3":3~:3g~ 3g~:~^~3~: :gg*c*a*:* "gggcc:gg cgggtccccg 900 

tcg"3~t:: :gac:gc"i *g:"~:--g ~::a:""c -cgcggtc:: cacgcacatc 960 

tggga::3^: *;a"gc:" ~ggg~ : ::--:: :*:g~:3a.;: ~g3::~~"g :tgc^ccagg 1020 

CTTATGACGT .-C:~~:3~33 -~GG~AAC~3 'GGGTGCC37 GGTCGCTAAT 1080 

gaaggcgga atgccaccg- 3G-*-::r: -ctgcagt^ vacggcggc ttacctcaca 1140 

ATATGTCA T C AGCG"A~" ;CG3~:::AG 3CjA^"C^A -GGGCA7GC3 CCGGCTTGAG 1200 

CT7GAACA7G C T CAG-Ai CAC3C :~:~ACAGC7 GGCTATTTGA GAAG7CAGGT 1260 

CG7GAT7ACA 7CCCAGGCCG :C~GC~33~G "C-ACGC7C AG'GCCGCCG C7GG77A7C7 1320 

GCCGGGT7CC A7C7CGAC" CCGCACC"A G77777GA7G AG T CAG7GCC 77G7AGC7GC 1380 

CGAACCACCA TCCGGCGGA^ CGC~33 — ~A "TTTGC T GT7 77ATGAAG7G GC7CGG7CAG 1440 

GAGTG77C77 G7T7CC7C:A GCCCGC3GAG GGGC7GGCGG GCGACCAAGG 7CA7GACAA7 1500 

GAGGCC7A7G AAGGCTCTGA TG7TGA7AC7 GC7GAGCC7G CCACCC7AGA CA77ACAGGC 1560 

7CA7ACA7CG 7GGA7GG7CG GTCTC'GCAA AC^G7C7A7C AAGC7C7CGA CC7GCCAGC7 1620 

GACC7GG7AG C7CGCGCAGC CCGAC7G7C^ GCTACAGTTA C7G77AC7GA AACC7C7GGC 1680 

CG7C7GGAT7 GCCAAACAA7 GATCGGCAA7 AAGAC7777C 7CAC7ACC77 7G77GA7GGG 1740 

GCACGCC77G AGG77AACGG GCG7GA.GCAG C77AACCTC7 C7777GACAG CCAGCAG7G7 1800 

AG7A7GGCAG CCGGCCCG7" "TGCC'CACC *A7GC7GCCG 7AGA7GGCGG GC7GGAAG77 1860 

CA77777CCA CCGC7GGCC C3AGAGCCG7 G7TG7777CC CCCC7GG7AA 7GCCCCGAC7 1920 

GCCCCGCCGA G7GAGGTCAC CGCC77C7GC 7CAGC7C777 A7AGGCACAA CCGGCAGAGC 1980 

CAGCGCCAG7 CGG7 T A7"GG ~AG T "G7GG C7GCACCC7G AAGG777GC7 CGGCC7G77C 2040 

CCGCCC7777 CACCCGG5CA 'GAGTGGCGG "CTGC^AACC CAT777GCGG CGAGAGCACG 2100 

C7C7ACACCC GCAC77GG': CACAA7'ACA GACACACCC7 7AAC7G7CGG GC7AAT77CC 2160 

GG7CA77"G ATGC'GC": ICAC'CGGGG GGGCCACC'G C7AC7GCCAC AGGCCCTGCT 2220 

GTAGGC7CG7 C~^M~C~~: -GACCC'GAC :CGC7ACC7G A7G77ACAGA TGGC7CACGC 2280 

ccc7c7gggg cccg'zcggi "gg::i:--c ::3aa'ggcg "ccgcagcg ccgc77ac7a 2340 

cacacc'acc :^ga;gg:3; -:-ga~:~-~ 3':gg:':ca 7— tcgag7c tgag7gcacc 2400 






iGGCTTGTCA ACGCA~_7~A ,jC"":.jCCAC .GlCCT'jGTG GCGGGCTTTG TCATGCTTTT 2460 

tttcagcg" ac::~i~~ t : g g- :gc: ^cca:gtttg ~gatgcgtga tggtcttgcc 2520 

GCG7ATACCC * t acACZC:G GGCGA'GAT* CA'GCGGTGG CCCCGGACTA TCGATTGGAA 2580 

CATAACCCCA AGAGGC'IGA GGC'GCCTAC CGCGAGACTT GCGCCCGCCG AGGCACTGCT 2640 

gc"a ~™-g:=:]c "gca~—ac caggtgcctg ttagtttgag ttttgatgcc 2700 

tgggagcgga aczaccgzi: gt'tgacgag ctttacctaa cagagctggc ggctcggtgg 2760 

tttgaa'c:a ac:gc::cgg tcagcc:acg -tgaacataa c t gaggatac cgcccgtgcg 2820 

15 gccaacctgg :::*ggagc "ac^ccggg agtgaagtag gccgcgcatg tgccgggtgt 2880 

AAAGTCGAGC C*"GGCG"G~" GCGGTA t CAG "7ACAGCC3 GTGTCCCCGG CTCTGGCAAG 2940 

TCAAAGTCCG "GCAACAGGC GGA'G'GGAT gttgttgttg TGCCCACTCG CGAGCTTCGG 3000 

20 

AACGCTTGGC GGCGCCGGGG C'T'GGGGCA TTCACTCCGC ACACTGCGGC CCGTGTCACT 3060 

AGCGGCCGTA GGGTTGTCAT TGATGAGGCC CCTTCGCTCC CCCCACACTT GCTGCTTTTA 3120 

25 CATATGCAGC GTGCTGCATC TGTGCACCTC CTTGGGGACC CGAATCAGAT CCCCGCCATA 3180 

GATTTTGAGC ACACCGGTCT GATTCCAGCA ATACGGCCGG AGTTGGTCCC GACTTCATGG 3240 

TGGCATGTCA CCCACCGTTG CCCTGCAGAT GTCTGTGAGT TAGTCCGTGG TGCTTACCCT 3300 

30 

AAAATCCAGA CTACAAGTAA GGTGCTCCGT TCCCTTTTCT GGGGAGAGCC AGCTGTCGGC 3360 

CAGAAGCTAG 7GTTCACACA GGCTGCTAAG GCCGCGCACC CCGGATCTAT AACGGTCCAT 3420 

35 GAGGCCCAGG GTGCCACTTT TACCACTACA ACTATAATTG CAACTGCAGA TGCCCGTGGC 3480 

CTCATACAGT CCTCCCGGGC TCACGCTATA GTTGCTCTCA CTAGGCATAC TGAAAAATGT 3540 

GTTATACTTG ACTCTCCCGG CCTGTTGCGT GAGGTGGGTA TCTCAGATGC CATTGTTAAT 3600 

40 

AATTTCTTCC TTTCGGGTGG CGAGGTTGGT CACCAGAGAC CATCGGTCAT TCCGCGAGGC 3660 

AACCCTGACC GCAATGT7GA CGTGCTTGCG GCGTTTCCAC CTTCATGCCA AATAAGCGCC 3720 

45 TTCCATCAGC TTGCTGAGGA GC'GGGCCAC CGGCCGGCGC CGGTGGCGGC TGTGCTACCT 3780 

CCCTGCCC7G AGCTTGAGCA GGGC:":" T ATC^GCCAC AGGAGCTAGC CTCCTGTGAC 3840 

AGTGTTGTGA CATTTGAGC iACTGACA" GTGCACTGCC GCATGGCGGC CCCTAGCCAA 3900 

50 

AGGAAAGC'G ""G7CCAC GCTGG'AGGC CGGTATGGCA GACGCACAAG GCTTTATGAT 3960 

GCGGGTCACA CCGATGTCCG CGCC'C::" GCGCGCTTTA TTCCCACTCT CGGGCGGGTT 4020 

55 ACTGCCACCA CZ'GTGAAC Z~"3AGC" G T AGAGGCGA TGGTGGAGAA GGGCCAAGAC 4080 

ggttcagcc j gg-'"j^g: -g: :3aga~3 t : t 33CGCat aacctttttc 4i40 

CAGAAGGA77 3 t aa:~a3" Z-CG-C'IS^: "GCA73GCAA AGTCGGTCAG 4200 

GGTATCTTCC GC"GAG~aa 3iC3""G' GC::'G"'G 3CC3C~GG77 CCGTGCGATT 4260 

GAGAAGGC7A ~7Z~-~ZZ:T 7"AC:aCAA }C'G'G":" ACGGGGA7GC 77A7GACGAC 4320 

7CAG7A77C7 CTGCGC-j Gu~"TjG-j33 AGC^A'GCJA "GG7G777GA AAA7GA7777 4380 

7C7GAGT77G AC"*:GAC7:a GA A A A 37*!"* 'CZZ'AGG': "GAG7GCGC CA77A7GGAA 4440 

GAG7G7GG7A 7GCCCCAG" GC"G*:AGG " ^a — " 3C3T3C3G7C GGCG7GGA7C 4500 

15 C7GCAGGCCC CAAAAGAG'3 """GAGAGGG "C7GGAAGA AGCA77C7GG 7GAGCCGGGC 4550 

AGCT7GC7CT 3GAA7ACG3* G^GAA" A" 3CAA7CA"T7G CCCA7~TGC7A 7GAG77CCGG 4620 

GACC7CCAGG TTGCCGCC77 CAAGGGCGAC GAC7CGG7CG 7CC7C7G7AG 7GAA7ACCGC 4680 

20 

CAGAGCCCAG GCGCCGG77C GC77A7AGCA GGC7G7GG77 7GAAG77GAA GGC7GAC7TC 4740 

CGGCCGA77G GGC7G7A7GC CGGGG7" T G~C G7CGCCCCGG GGC7CGGGGC CC7ACCCGA7 4800 

25 GTCG77CGA7 TCGCCGGACG GCT7TCGGAG AAGAAC7GGG GGCC7GA7CC GGAGCGGGCA 4860 

GAGCAGC7CC GCC7CGCCG7 GCAGGAT77C C7CCG7AGG7 TAACGAA7G7 GGCCCAGA77 4920 

7G7G77GAGG 7GG7G7C7AG AG777ACGGG 3 T ~73CCCGG GTC7GGT7CA 7AACC7GA7A 4980 

30 

GGCA7GC7CC AGAC T A77GG 7GATGG"*"AAG GCGCA7*77A CAGAGTCTGT 7AAGCC7A7A 5040 

C77GACC77A CACAC7CAA7 7A7GCACCGG '373AATGAA 7AACA7G7GG 777GC7GCGC 5100 

35 CCA7GGG77C GCCACCA73C GCCC7AGG33 TC777*GC7G 77G77CC7C7 7G77TC7GCC 5160 

7A7G77GCCC GCGCCACCGA CCGG7CAGCC G7C7GGCCGC CG7CG7GGGC GGCGCAGCGG 5220 

CGG7ACCGGC GG7GG777C7 GGGG7GACCG GGT~GA77C7 CAGCCC7TCG CAA7CCCC7A 5280 

40 

7A77CA7CCA ACCAACCCC7 77GCCCCAGA CG77GCCGC7 GCG7CCGGG7 C7GGACC7CG 5340 

CC77CGCCAA CCAGCCCGGC CAC7'GGC7C CAC77GGCGA GA7CAGGCCC AGCGCCCC7C 5400 

45 CGC7GCC7CC CG7CGC3GAC C7GCCACAGC CGGGGC T GCG GCGC7GACGG C7G7GGCGCC 5460 

7GCCCA7GAC ACC7CACC3G "C:CGGACG7 "~GA"G"GC GG7GCAA77C 7ACGCCGCCA 5520 

G7A7AA777G TC~AC^^C~Z ZZZ~ZinZ~~Z :7C T G7GGCC 7CTGGCAC7A A777AG7CC7 5580 

50 

G7A7GCAGCC CCCC77A A"C CGCC'C'GC: GCGCAGGAC GG7AC7AA7A C7CACA77A7 5640 

GGCCACAGAG GCCTCCAAT' A^CAC-G^a CCGGGT7GCC CGCGC7AC7A 7CCG77ACCG 5700 

55 GCCCC7AGTG CC7AA7GCAG "GGAGGZ'A "C~a:"A7CC A777C777C7 GGCC7CAAAC 5760 
JaCCACAACC CC^CA"" " jACA ~GA A "::A"AC t TCCACTGATG TCAGGATTCT 5820 

TGTTCAAC:' GGCA'AGCAT :"3AA"3G' CA^CCIAAGC GAGCGCCTTC ACTACCGCAA 5880 

TCAAGGT T GG CGCT:gG"G AGACA~:~GG TG"GC'GAG GAGGAAGCCA CCTCCGGTCT 5940 

TGTCAT377A 'GCA'ACA'G GC'C'CZAGT TAAC7CCTAT ACCAATACCC CTTATACCGG 6000 



CACCAATACA 


C j a ■ G i v. C ~ 




CAC : GCTCGT 




GAGGGGCCGA 


6120 


CGGGACTGCG 


jAjC"ACIA 


CAAC'GCAGC 


C A C C A G G t i C 


ATGAAAGATC 


TCCACTTTAC 


6180 


CGGCCTTAAT 


GGG'j « -"j a ■ j 


A AG. . jolv. j 


r r ~ ' t • ^ ~ t 

^ J ! J o i H u L . 


CTAACATTAC 


TTAACCTTGC 


6240 


TGACACGCTC 


ctcggcggg: 


l ^ C j A ^ ** G A 


ATTAA-TTCG 




GGCAACTGTT 


6300 


TTATTCCCGC 


CCGGT T G T C T 


r r> r ' i t ^ ^ 


CGAGCCAACC 


GTGAAGCTCT 


ATACATCAGT 


6360 


GGAGAATGCT 




AGGG ■ GT"!*GC 


TATCCCCCAC 


GATATCGATC 


TTGGTGATTC 


6420 


GCGTGTGGTC 


AT T CAGGATT 


ATGAC-ACCA 


GCATGAGCAG 


GATCGGCCCA 


CCCCGTCGCC 


6480 


TGCGCCATCT 


CGGCCTTTTT 


CTGTTCTCCG 


AGCAAATGAT 


GTACTTTGGC 


TGTCCCTCAC 


6540 


TGCAGCCGAG 


TATGACCAGT 


CCACTTACGG 


GTCGTCAACT 


GGCCCGGTTT ATATCTCGGA 


6600 


CAGCGTGACT 


TTGGTGAATG 


TTGCGACTGG 


CGCGCAGGCC 


GTAGCCCGAT 


CGCTTGACTG 


6660 


GTCCAAAGTC 


ACCCTCGACG 


GGCGGCCCCT 


CCCGACTGTT GAGCAATATT 


CCAAGACATT 


6720 


CTTTGTGCTC 


CCCCTTCGTG 


GCAAGCTCTC 


CTTTTGGGAG 


GCCGGCACAA 


CAAAAGCAGG 


D/OU 


TTATCCTTAT 


AATTATAATA 


CTACTGCTAG 


TGACCAGATT 


CTGATTGAAA 


ATGCTGCCGG 


6840 


CCATCGGGTC 


GCCATTTCAA 


CCTATACCAC 


CAGGCTTGGG 


GCCGGTCCGG 


TCGCCATTTC 


6900 


TGCGGCCGCG 


GTTTTGGCTC 


CACGC ; CCGl 


CCTGGCTCTG 


CTGGAGGATA 


CTTTTGATTA 


6960 


TCCGGGGCGG 


GCGCACACAT 


TTGATGACTT 


CTGCCCTGAA 


TGCCGCGCTT TAGGCCTCCA 


7020 


GGGTTGTGCT 


T~-r r ■ <i 


:tgtcg-:*ga 


GC TC CAGCGC 


CTTAAAGTTA 


AGGTGGGTAA 


7080 


AACTCGGGAG 


! ! O I .*■* 1 J . > M 




CCCACCTACT 


TATATCTGCT 


GATTTCCTTT 


7140 




i C i CovjTloC 


Gl'JL ! ^ ^ I G 


A 






7171 



The above sequence was obtained from 
polyadenylated clones. For clarity the 3' polyA 
"tail" has been omitted. 



TGCCCT,GGC T: AC"GGAC " T GCC"AGA GC'GAGTTT C3CAATCTCA CCACCTGTAA 



6060 
The sequence above includes a partial cDNA 
sequence consisting of 1661 nucleotides that was 
identified in a previous application in this series. 
The previously identified partial sequence is set 
forth below, with certain corrections (SEQ ID NO. 11). 
The corrections include deletion of the first 80 bases 
of the prior reported sequence, which are cloning 
artifacts; insertion of G after former position 174, 
of C after 270, and of GGCG after 279; change of C to 
T at former position 709, of GC to CG at 722-723, of 
CC to TT at 1233-39, and of C to G at 1606; deletion 
of T at former position ^55; and deletion of the last 
11 bases of the former sequence, which are part of a 
linker sequence and are not of viral origin. 
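
The length bookkeeping behind these corrections can be checked with a short sketch. It assumes the corrected sequence is 1575 nucleotides long (the final position count printed with SEQ ID NO. 11) and that the base substitutions leave the length unchanged; the variable names are illustrative only and do not come from the original application.

```python
# Corrected length of SEQ ID NO. 11, taken from the final position count
# printed with the sequence listing (an assumption of this sketch).
CORRECTED_LENGTH = 1575

# Length-changing corrections listed in the text, as (description, delta).
# Substitutions (C to T, GC to CG, CC to TT, C to G) do not change length
# and are therefore omitted.
corrections = [
    ("delete first 80 bases (cloning artifacts)", -80),
    ("insert G after former position 174", +1),
    ("insert C after former position 270", +1),
    ("insert GGCG after former position 279", +4),
    ("delete one T", -1),
    ("delete last 11 bases (linker sequence)", -11),
]

# Net effect of all insertions and deletions on sequence length.
net_change = sum(delta for _, delta in corrections)

# Length the previously reported sequence must have had before correction.
prior_length = CORRECTED_LENGTH - net_change

print(f"net change: {net_change} nt")
print(f"implied prior length: {prior_length} nt")
```

Under these assumptions the insertions and deletions remove a net 86 nucleotides, so the previously reported sequence would have been 1661 nucleotides long.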

Non-A Non-B: Mexican strain; SEQ ID NO. 11: 

'AA7 T TCTTCCTTT CGGGTGGCGA 60 

:AAC CCTGACCGCA ATGTTGACGT 120 

TIC CATCAGCTTG CTGAGGAGCT 180 

CCC TGCCCTGAGC TTGAGCAGGG 240 

AGT GTTGTGACAT TTGAGCTAAC 300 

AGG AAAGC7GTTT TGTCCACGCT 360 

GCG GGTCACACCG ATGTCCGCGC 420 

ACT GCCACCACCT GTGAACTCTT 480 

GGT TCAGCCGTCC TCGAGTTGGA 540 

C-G AA5GATTGTA ACAAGTTCAC 600 

33' at- — CCGCT GGAGTAAGAC 660 

GAG AAGGCTATTC TATCCCTTTT 720 

"A G~A7 T CTC"G CTGCCGTGGC 780 

"CT GAGT'TGACT CGACTCAGAA 840 

".AG '3TGGTATGC CCCAGTGGCT 900 

'3 CAGGCCCCAA AAGAGTCTTT 960 




GTTGCGTGAG GTGGGTA7C7 CAGA'3C:a t "T~aa7 
20 GGTTGGTCAC CAGAGACCAT CGG7C~"CC GCG AGG C 

GCTTGCGGCG TTTCCACCTT CATGCCAAAT AAGCGCC 
GGGCCACCGG CCGGCGCCGG TGGCGGC7G7 GC'ACCT 

25 

CCTTCTCTAT CTGCCACAGG AGCTAGCCTC CTG'GAC 
TGACATTGTG CAC7GCCGCA TGGCGGCCCC TAGCCAA 
30 GGTAGGCCGG TATGGCAGAC GCACAAGGC7 ~7A 7GAT 

CTCCCTTGCG CGC7T7A7^C CCAC7C7CGG 3CGGG7T 
TGAGCTTGTA GAGGCGA^GG TGGAGA-3GG CCAAGAC 

35 

TTTGTGCAGC CGAGA'GTCT CCCGCA~UAC C"™ C 
GACCGGCGAG ACAA , oCGC -~GG~-A«3~ CG3 r CA3 
4 0 CTTTTG7GCC C'G" ~GGCC CCGG" "G 'GCG-7' 

ACCACAAGC. j7j~" 3 oGG-~3G ~ G a 7 3 A C 
TGGCGCCAGC CATGC3A73G 'G""} — -^ ~GA~" T ' 
TAACTT7TCC C~AGG~:"3 ~3~G:g:;~~ "A'3GAA; 
TG7CAGG"G CGA^GGGG 'CC3G~1333 3~3GA*C! 






^hGAGGGTTC 




- ■ j -J ^wjj'j^.^.j i , jCTCTGGA 


ATACGGTGTG 


1020 


GAACATGGCA 


-*:-"g::: 


- z~ ZGjGA^ L.TCAGG77G 


CCGCCTTCAA 


1080 


GGGCGA C jA^, 


-GG'ir:: 


- ~ — - - • - > -~ , „ „ ^ „ „ « r ,^,^^ r ,_^^^ 


CCGGTTCGCT 


1140 


7A7AGC AGG C 


"-G"-: 


-g"g^gg: t gac~~ccgg ccgattgggc 


TGTATGCCGG 


1200 




k:::v^z 






1260 




^cgg:gg: 


. . jCGGGCAGAG cagctccgcc 


TCGCCGTGCA 


1320 


GGAT^TC^TC 


C J ; J J - ~ 


j . ; ■.: •„ j j [ TGAGGTGG 


TGTCTAGAGT 


1380 


77ACGGGG77 


"CZZZ 3GG" 


GcTT^at^a ictga^aggc ATGCTCCAGA 


CTATTGGTGA 


1440 


TGGTAAGGC G 




ag*:tg"aa 3c:ta^act t gaccttacac 


ACTCAATTAT 


1500 


GCACCGG7C" 


GAA7GAA AA 


CA7G7GG77*" GC7GCGCCCA 7GGG77CGCC 


ACCATGCGCC 


1560 


CTAGGCCTCT 


TTTGC 






1575 



When comparing the Burmese and Mexican 
strains, 75.7% identity is seen in a 7189 nucleotide 
overlap beginning at nucleotide 1 of the Mexican 
strain and nucleotide 25 of the Burmese strain. 
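
As a rough illustration of these figures, the sketch below locates where the Mexican sequence begins within the Burmese sequence and converts the stated percentage into an approximate count of identical positions. The two string constants are copied from the opening rows of the sequence comparison in this document; the function names are hypothetical, and the count is only approximate because the 75.7% figure is itself rounded.

```python
# Opening of the Burma sequence and the first 14 nucleotides of the Mexico
# sequence, copied from the first rows of the comparison in this document.
BURMA_START = "AGGCAGACCACATATGTGGTCGATGCCATGGAGGCCCATCAGTTTATTAAGGCTCCTGGCA"
MEXICO_START = "GCCATGGAGGCCCA"


def overlap_start(reference: str, query_prefix: str) -> int:
    """Return the 1-based position in `reference` where `query_prefix` begins."""
    index = reference.find(query_prefix)
    if index < 0:
        raise ValueError("query prefix not found in reference")
    return index + 1


def identical_positions(percent_identity: float, overlap_length: int) -> int:
    """Approximate number of identical positions implied by a percent identity."""
    return round(percent_identity / 100 * overlap_length)


start = overlap_start(BURMA_START, MEXICO_START)
matches = identical_positions(75.7, 7189)

print(f"Mexican nucleotide 1 aligns with Burmese nucleotide {start}")
print(f"about {matches} of 7189 overlapping positions are identical")
```

The exact-match search works here only because the first 14 nucleotides of the two strains happen to agree; a real comparison over the whole overlap would need an alignment method that tolerates mismatches and gaps.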

In the same manner, a different strain of 
HEV was identified in an isolate obtained in Tashkent, 
U.S.S.R. The Tashkent sequence is given below (SEQ ID 
NO. 12): 

SEQ ID NO. 12: 



CGGGCCCCGT ACAGGTCACA ACCTGTGAGT TGTACGAGCT AGTGGAGGCC ATGGTCGAGA 60 

AAGGCCAGGA TGGCTCCGCC GTCCTTGAGC 7CGA7C7C7G CAACCGTGAC GTGTCCAGGA 120 

TCACCTTTTT CCAGAAAGAT TGCAATAAGT TCACCACGGG AGAGACCATC GCCCATGGTA 180 

40 AAGTGGGCCA GGGCATT7CG GCC~GGAG T A AGACC77C7G TGCCCTTTTC GGCCCCTGGT 240 

TCCGTGCTAT 7GAGAAGGC ATTCTGGCCC 7GC7CCC7CA GGGIGTGTTT TATGGGGATG 300 

CCTTTGATGA CACCGTCT'C TC3GC3CG" 'GGCCGCAGC AAAGGCGTCC ATGGTGTTTG 360 

45 

AGAATGACTT T7C73AG777 GAC7CCACCC AGAA"AA777 T7CCC7GGGC C7AGAG7G7G 420 

C7A77A"GGA GAAG~G7GGG A7GCCG-AG7 GGC7CATCCG CTTGTACCAC C77A7AAGG7 480 

50 C7GCG7GGA7 CC'GC-GGCC ::G-AG~~3~ ZZC'GGGAGG G7G77GGAAG AAACAC7CCG 540 

G7GAGCCCGG CAC'™:*- "G^-AC" "7GGAACA7 GGCCGT7A7C ACCCA77G77 600 






AC GA T , . C wG 


CGA i i . uCAG 


J ' Jul ' Jv-i, I 


, AAA^ oA 


"""GATTCGATA 


GTGCTTTGCA 


660 


GTGAGTAC CG 


T^ AGAGTC ^ A 


GGGGC"" GCT3 


- CTGA "GC 


■ ^G w ; G i GGC 


TTAAAGCTGA 


720 


AGGTGGGT^T 




GGTTTG""A"!"G 


CAGG j i i GT 


GGTGACCCCC 


GGCCTTGGCG 


780 


CGCTTC CC «A 


C G : C G ■ G C G C 


TTGTCCGGCC 


Gv:^T , AC"A 


AuA ATTGG 


GGCCCTGGCC 


840 


CTGAGC GGGC 


GGAGCAGCTC 


CGC CT i aC"^G 


TGCG 






874 



As shown in the following comparison of 
sequences, the Tashkent (Tash.) sequence more closely 
resembles the Burma sequence than the Mexico sequence, 
as would be expected of two strains from more closely 
related geographical areas. The numbering system used 
in the comparison is based on the Burma sequence. As 
indicated previously, Burma has SEQ ID NO: 6; Mexico, 
SEQ ID NO: 10; and Tashkent, SEQ ID NO: 12. The 
letters present in the lines between the sequences 
indicate conserved nucleotides. 
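
The convention for the middle line of each comparison block can be sketched as follows. This is a minimal illustration, not the procedure used to prepare the comparison: it assumes the two rows are already aligned, equal in length and free of gap characters, and the function names are hypothetical. The example rows are the Burma and Mexico rows for Burma positions 61-120, copied from the comparison in this document.

```python
def conserved_line(seq_a: str, seq_b: str) -> str:
    """Build the middle line: the shared letter where both rows agree, else a space."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned rows must have equal length")
    return "".join(a if a == b else " " for a, b in zip(seq_a, seq_b))


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions at which the two rows carry the same letter."""
    line = conserved_line(seq_a, seq_b)
    matches = sum(1 for letter in line if letter != " ")
    return 100.0 * matches / len(line)


# Rows for Burma positions 61-120, as printed in the comparison.
burma = "TCACTACTGCTATTGAGCAGGCTGCTCTAGCAGCGGCCAACTCTGCCCTGGCGAATGCTG"
mexico = "TCACTACTGCTATTGAGCAAGCAGCTCTAGCAGCGGCCAACTCCGCCCTTGCGAATGCTG"

middle = conserved_line(burma, mexico)
identity = percent_identity(burma, mexico)

print("-BURMA  " + burma)
print("        " + middle)
print("-MEXICO " + mexico)
print(f"{identity:.1f}% identity over {len(middle)} positions")
```

This covers a single 60-nucleotide window, so its identity value need not match the figure reported for the full overlap.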



ICv 20v 30v 40v 50v 60v 

-BURMA AGGCAGACCACATATGTGGTCGATGCCATGGAGGCCCATCAGTTTATTAAGGCTCCTGGCA 
25 GCCATGGAGGCCCA CAGTT ATTAAGGCTCCTGGCA 

-MEXICO GCCATGGAGGCCCACCAGTTCATTAAGGCTCCT6GCA 

70v 80v 90v 100v HOv 120v 

-BURMA TCACTACTGCTATTGAGCAGGCTGCTCTAGCAGCGGCCAACTCTGCCCTGGCGAATGCTG 
30 TCACTACTGCTATTGAGCA GC GCTCTAGCAGCGGCCAACTC GCCCT GCGAATGCTG 

-MEXICO TCACTACTGCTATTGAGCAAGCAGCTCTAGCAGCGGCCAACTCCGCCCTTGCGAATGCTG 

130v I40v I50v I60v 170v 180v 

-BURMA TGGTAGTTAGGCCTTTTCTCTCTCACCAGCAGATTGAGATCCTCATTAACCTAATGCAAC 
35 TGGT GT GGCCTTT CT TC CA CAGCAG TTGAGATCCT AT AA CT ATGCAAC 

-MEXICO TGGTGGTCCGGCCTTTCCTTTCCCATCAGCAGGTTGAGATCCTTATAAATCTCATGCAAC 

190v 20Cv 210v 220v 230v 240v 

-burma ctcgccagcttgttttc:gc:ccgaggttttctggaatcatcccatccagcgtgtcatcc 
40 ctcg cagct gt tt cg zz gaggtttt "ggaatca cc at ca cgtgt at c 

-MEXICO CTCGGCAGCTGG'GTTTCGTC:*3AGG T "T"'GGAATCACCCGATTCAACGTGTTATAC 

250v 260v 270v 2S0v 290v 300v 

-BURMA ATAACGAGC T GGAGCTTTACTGCCGCGCCCGCTCCGGCCGCTGTCTTGAAATTGGCGCCC 
45 ATA A GAGCT GAGC TA T G C C G GC CGCTC GG CGCTG CTTGA ATTGG GCCC 

-MEXICO ATAATGAGCTTGAGCAGTATTGCCGTGCTCGCTCGGGTCGCTGCCTTGAGATTGGAGCCC 

310v 32Cv 33Cv 3-0v 350v 360v 

-BURMA ATCCCCGCTCAATAAATGATAATCCTAATGTGGTCCACCGCTGCTTCCTCCGCCCTGTTG 
50 A CC CGC~C AT AATGA~AATC"AA'GT tcca CGCTGCTT CTCC CCC GT G 

-Mexico ac:cacgc'c:a^'aa^ga t ;a^:: t aa t g'cc"cca^:gctgctttctccaccccgtcg 



: : " :, -Cv 4I0v 420v 

-BURMA I^IY}- ' "~ ~ ■ "jCGGGCCGGCTGCTAATTGCCGGC 

3 - J ; "^ J <V~;" . ■ 3 3G :c GC GC AA TG CG C 

-Mexico g::ggg-"":--: ■ ■ . v-: •-"■.gggc-cctgcggcgaactgtcgcc 

' • ■ " -70v 480v 

-3USMA " " V ~ r ~ ; " ~ : ^-* : " A ' : ~ GCCT CGACGGGTTTTCTG 

- - - - - * ; ' ^ ... o T GA GG TTT C G 

-mexico vvr:- ~ogc:c~ac t gttttgatggctttgccg 

aIT.- 5ZCv 530v 540v 

-burma ^~5' : " : ' c :::g: :: -^ : "-'^^^' ; '^ 3 '-::'c^-ctccc7tcatgatatgtcaccat 

-gc^agac";: - go ccta tc ct catga tg cc 
-mexicc ; ^'^-3 t ^'"3::gc:3-3-:'ig':-i^c'"C'attctctccatgacttgcagccgg 

5 - v 53Cv 590v 600v 

-burma c t ga"t:gccg-gg:^*g* t c:gc:-' r irratgacgcggctctatgccgccctccatc 

c"at3~ gccgaggc - - cgcc- vg atgac cg ct tatgc gc tcca 
-mexico ctga;g'*gc:g-gg:g-"^vv:*:gc:^c::"i.:a-gacccgc:tttatgcagctttccact 

6: - v ^- ■ ^Ov 540v 650v 660v 

-burma ttccgc:tgagg"ctgc'v:::cc'ggc-c-atcgcaccgcatcgtatttgctaattc 
t cc cc gagg" ct c'gcc ccggc-c ^ tg ac catc ta ttgct at c 
-mexico t gcc'c:ag"g"g;^::-y::^:-'v.:ac: t ac:ggacatcatcctacttgctgatcc 

5^0v 5 : 0v -::-v 7 O0v 710v 720v 

-BURMA ATGACGG T -GGCr;CG T 'G"GG^GACG"-'GAGGGT3ATACTAGTGCTGGTTACAACCACG 
A GA GG'A GCGCG GT 3' AC TA'GAGGGTGA AC TAG GC GGTTACAA CA G 
-MEXICO ACGATGGTAAGCGCGCnG''G":-C'"T«"-GGGTGACACTAGCGCCGGTTACAATCATG 

; 30v 7J0v 750v ^50v 770v 780v 

-burma atgtc'ccaact t gcgc t :c t ggat'ugaaccacc-iaggttaccggagaccatcccctcg 
atgt cca c cgc 0 "gg a" -g -r ac aaggtt gg ga ca cc t g 
-mexico atgttgccac::"ccgca:-vga7:;gg^caac t aaggttgtgggtgaacaccctttgg 

? 50v SOCv 3ICv 320v 330v 840v 

-BURMA TTATCGAGCGGGTTAGGGCCiT'GGCTGCCACTTTGTTCTCTTGCTCACGGCAGCCCCGG 
T ATCGAGCGGGT GGG AT^GGCTG CACTTTGT T TTG TCAC GC GCCCC G 
-MEXICO TGATCGAGCGGGTGCGGGGTATTGGCTGTCACTTTGTGTTGTTGATCACTGCGGCCCCTG 

350v S60v ~70v S80v 890v 900v 

-BURMA ^GCCATCACC x ^GC: T 'A-G"::":CCC;:GG"C r ACCGAGGTCTATGTCCGATCGA 

agcc ~c cc a'gcc "T a 3-tc:"acc: :g ~c ac gaggtctatgtccg tc a 
-mexico agccctccccg-'gccct-cgt-'::" acc:gcg"cgacggaggtctatgtccggtcta 

"2:-. :acv 950v 960v 
-BURMA -C"CGG:::GG3"lG^r:- — — A acctcatgctccactaagtcgacct 

" : " ^ :: ^ ^ :::: c:c ACC C 'G C AAGTC AC t 
-mexico "'"ggg:::~: v: W:--; : : "v-ccgcttgtgctgtcaagtccactt 

• : " ^::v :o:cv 1020V 

-BURMA -CC-G."VC : ' v- ~-a ; ; j"~"-"GCGT*CGGGGCCACCTTGGATG 

:A " :■ a.-; G-r- a'gc' *T GGGGCCACC T GA G 

-mexico "c-:"::.:g " :::--a--- - - ^'."G'C't^gctc^tggggccaccctcgacg 
-BURMA 
-MEXICO 



:o50v :o7Cv iosov 

- C Z C j C j G C A 77AG C TA C A AGGTC A 

*c:~:g ggcattagcta aaggt a 
;c:^ t :gtggcattagctataaggtaa 



-■EX I C 



U30v L140v 
"AGGACGCCC7CACAGC7G 
onuGA GC CTCAC GC G 

:gaggatgcgctcactgcag 



15 



-BURMA 
-MEXICO 



::ac 

u ■„ .** _ A. o ^ u 



U90v 1200v 
'C T CCGCACCCAGGCTATAT 
" 7 CG ACCCAGGC AT T 
"77GCGGACCCAGGCGATTT 



20 



25 



30 



35 



40 



45 



50 



-3URMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 



■„ - f* •*■ U J J 1 J - 

C A AG jo - 



4 . U J O i ij 



2Cv 124Cv 1250v 1260v 
^GCA7GCC:AGAAG777A7AACACGCCTCTACA 
3A CA7GC CAGAA 7TTAT CACGCCTCTACA 
jA^CATGCTCAGAAATTTATTTCACGCCTCTACA 



-270v 1230- L29Cv 13CCv 1310v 1320v 

GC7GGC'Z"TCGAGAAG T CC 3GCCG7GA77ACA7CCC7GGCCGTCAGTTGGAGTTCTACG 
GC73GC "7 3A3AAG7C 3G CG7GA77ACATCCC GGCCG CAG 7G AGTTCTACG 
GC7GGC'A777GA3AAG':agG t CG7GATTacA7CCCAGGCCGCCAGCTGCAGTTCTACG 



1333v 

L ^ L * o ; J L ■ J o ^ J . 

C CAG7GC G CG'. 
CTCAG7GCCGCCGC 

139Cv 
ACGAbTCGGCCCC 
A GAG7C 3 CC 
A7GAG7CAG7GCC 



:34Cv ;350v 136Cv 1370v 1380v 
r CA7C77GA7CCACGGGTGTTGGTTTTTG 
GCC3G 7T CA7CT GA CC CG TT GTTTTTG 
"GG7"ATC7GCCGGGT7CCATCTCGACCCCCGCACCTTAGTTTTTG 

NC n v :::0v l42Cv 1430v 1440v 

C7gc:a77gtaggac:gcgatccgtaaggcgctctcaaagttttgct 
t g 7g 3 -cc c a7ccg g aaa ttttgct 

TTG~AG"GCC3AACCACCA7CCGGCGGATCGCTGGAAAATTTTGCT 



1450v 1450v I470v I430v 1490v 1500v 

GC77CA7GAAG7GGC77GG7CAGGAG7GCACC7GC77CC77CAGCCTGCAGAAGGCGCCG 
G 77 ATGAAGTGGCT GG t CAGGAGTG C 7G TTCC7 CAGCC GC GA GG G 

G777'A7GAAG7GGC7CGG7CAGGAG7G77C77G777CC7CCAGCCCGCCGAGGGGCTGG 



1 5 lOv I52Cv 
7 C G G C G A C C A G G G 7 C A 7 G A ' 
GGCGACCA 33"A'GA 
CGGGCGACCA A 3G7CA7GA i 



CCGCCA' 



1533v 1540v 1550v 1560v 
"GAAGCCTATGAGGGGTCCGATGTTGACCCTGCTGAGT 
rGA GCCTA7GA GG 7C GATGTTGA C7GC7GAG 
"3AGG::*A7GAAGGC7CTGATGTTGATACTGCTGAGC 

:59Cv 160Cv 1610v 1620v 
r:C7:7GTCG7CCCTGGCACTGCCCTCCAACCGCTCT 
"■: "4 tcgt TGG C C7 CAA C 7CT 
:*:A7ACATC3*GGATGG7CGGTCTCTGCAAACTGTCT 



^33* 1 55 :. l55Cv 1670v 1680v 

-burma ^c:agg:::":g-":*:: ■:3:"g-ga"3'gg:t:3cgcgggccggctgaccgccacag 

A - ; 3CJ"3A :; - r 3C t :GCGC G CCG C7G c gc acag 

-mexicg ^'c--g:":":3-.::-g:;a3:'ga.:;-3G7agct:3:gcagc::gactgtctgctacag 



: ' : < — ^-Ov 1 73Cv I740v 

-BURMA 'AAA 33G:^^:i-"3C3AGACCC"CTTGGTAACAAAACCT 
; ~ r : : - "G ~ 3A"GC A AC 7 T GG AA AA AC 7 

-MEXICO r ^C""-C~G~« ~C:~:~GGCC3~:~3GA"GC;aaaCAA7GA7CGGCAA7AAGACTT 

: ^- :v — -"Zv i73Qv i790v 1800v 

~ ; G A CCA A7GGCCCAGAGCGCCACAATC 



10 -mexic: 



1ACGGGCCTGAGCAGCTTAACC 



15 -MEXICO 



— ^3Cv I3acv 1850v 1360v 

burma ::":3A"c:~r - : ^g:ac t a^gg::gc t jGccc"tttcagtctcacctatgccg 

" }; : ^ ; *-"3C GC GGCCC TT G CTCACCTATGC G 

- :""3A~: 33:^3: aG"^g^:~33CAGCCGGCCCG7777GCC7CACC7A7GC7G 



•37Cv L33C-. :3?Cv igoov 1910v 1920v 

-BURMA C C T C TGC A G C " GGG C * G G - G G ' G C G C " A ' G"G C TG C C GGGC TTGAC CATCGGGCGGTTT 
CC G G GGGZ'GGA 37 C C GC GG CT GA CG G GTTT 

20 -MEXICO CCG T AGA'GGCGGG:'GGAAG'^CAT"TT'CCACCGCTGGCCTCGAGAGCCGTGTTGTTT 

1930v l94Qv 1950v 1960v 1970v 1980v 
-BURMA TTGCCCCCGGTG7T7CACCCCGGTCAGCCCCCGGCGAGGTTACCGCCTTCTGCTCTGCCC 
7 CCCC GG7 7 C CC C C CC G GAGG7 ACCGCC77C7GC7C GC C 
25 -MEXICO 7CCCCCC7GG7AA7GCCCCGAC7GCCCCGCCGAG7GAGG7CACCGCC77C7GC7CAGC7C 

1990v 2000v 2010v 2020v 2030v 2040v 
-BURMA 7A7ACAGG777AACCG7GAGGC CC AGCGCCA77CGC7GA7CGG7AAC77A7GG77CCA7C 
7 7A AGG AACCG AG CCAGCGCCA 7CG 7 A7 GG7A 77 7GG 7 CA C 
30 -MEXICO 777A7AGGCACAACCGGCAGAGCCAGCGCCAG7CGG77A77GG7AG777G7GGC7GCACC 

2050v 2060v 2C70v 2080v 2090v 2100v 
-BURMA CTGAGGGAC7CA77GGCC7C77CGCCCCGT7TTCGCCCGGGCATGTTTGGGAGTCGGCTA 
C7GA GG 7 7 GGCC7 "C C CC 7777C CCCGGGCA7G 7GG G7C GC7A 
35 -MEXICO CTGAAGGTTTGCTCGGCCTG7TCCCGCCCTTTTCACCCGGGCATGAGTGGCGGTCTGCTA 

2110v 2l20v 2130v 2140v 2150v 2160v 
-BURMA ATCCAT7C7G7GGCGAGAGCACACT7TACACCCG7ACTTGGTCGGAGGTTGATGCCGTCT 
A CCA7- 7G GGCGAGAGCAC C TACACCCG AC7TGGTC 77 G C 
40 -MEXICO ACCCA7777GCGGCGAGAGCACGCC7ACACCCGCACTTGGTCCACAATTACAGACACAC 

2170v Z130v 2190v 2200v 2210v 2220v 
-BURMA C7AG7CCAGCCCGGCC7GAC7'AGG777TA7G7CTGAGCCTTCTATACCTAGTAGGGCCG 
C C G C GGC 7 GG7 7 7G 7G C7 C C G GG C 
45 -MEXICO CC77AAC7G7CGGGCTAA7"CCGG*CA777GGA7GC7GCTCCCCACTCGGGGGGGCCAC 

2230v 22*Cv 2250v 2260v 2270v 2280v 
-BURMA CCACGCC T ACCC7GGCGGCCCC'C1"ACCCCCCCC7GCACCGGACCC77CCCCCCC7CCC7 
c C C CC 3 C C7 ~A C C C7G C C CCC C 

50 -MEXICO C7GC'AC7GCCACAG3:i:"C'G7^GGC7CG7C7GAC T C7CCAGACCC7GACCCGC7AC 

229Cv 2 3 C C v 23 10/ 2320v 2330v 2340v 
-BURMA C7GCC:C3GCGC7'GC"-GC:3GC77C7GGCGC7ACCGCCGGGGCCCCGGCCATAACTC 
crrj 0 : 3 *C'GG 3C C G G CCC C A 7 

55 -mexico C73m7G7";caga-ggc":acgc::c':7gggg:c:g7ccggc7ggccccaacccgaa7g 
-BURMA 

-MEXICO 



10 



-BURMA 
-MEXICO 



-3 33v 2390v 2400v 
"-3CAC::3GATGGC7CTAAGGTATTCG 
. uA GGC CTAAG T T G 

■ cccgacggcgctaagatctatg 



Z^Cv 2450v 2460v 
"GC:3"AAC3CGTCTAATGTTGACCACC 
"GGC GT AACGC TCTAA G G CCACC 
jj^ j LAAC oCATCTAACGCCGGCCACC 



15 



-BURMA 
-MEXICO 



2 A Cv 



25COv 2510v 2520v 

-ccaaigg'accccgcctcctttgatgctg 
:a 3 ~accc g tc tttga gc 

"CAGC377ACCCTGATTCGTTTGACGCCA 



20 



25 



30 



35 



40 



45 



-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 



CC "GTGA-GCG 3' 



2^5Cv 255Cv 2570v 2580v 

;Cg:ggc:g:gtacacactaaccccccggccaataattc 
gccgcgta ac ct ac ccccggcc at attc 
i'ctgccgcgta'acccttacaccccggccgatcattc 



2 590v 



I ^ O ^ >_ _ L I ■ J ^ 



251Cv 2520v 2630v 2640v 
"-GG GGA ACATAACCCAAAGAGGCTTGAGGCTGCTTATC 
A GC GT G23CC 3A " 3 " T ;g-a:-taaccc AAGAGGCT GAGGCTGC TA C 
ATGCGGTGGCCCSGAC;*: 3A"3GAACATAACCCCAAGAGGCTCGAGGCTGCCTACC 

265Gv 255C 1 5 7 C v 25S0v 2690v 2700v 

gggaaac tt gccc:g:ccgg:;ccgcgcatacccgctcctcgggaccggcatatacc 
g ga accgc cc:gc: ggca.c gctgc TA cc ctc t gg c ggcat tacc 

gcgagacttgcgcc:gccgagg:a:tgctgcctatccactcttaggcgctggcatttacc 

2710v 272Cv 273Cv 2740v 2750v 2760v 

aggtgccgatcggccccagcttgacgcctgggagcggaaccaccgccccggggatgagt 
aggtgcc t g AG ga gcctgggagcggaaccaccgccc ga gag 

AGGTGCC7GTT4GTTTGAGT"TTGATGCCTGGGAGCGGAACCACCGCCCGTTTGACGAGC 

27'Gv 27= Ov 279Cv 2300v 2810v 2820v 
TGTACCTTCCTGAGCTTGCGCCAGATGGTTTGAGGCCAATAGGCCGACCCGCCCGACTC 



T TACC 7 C GAGC 



3C G TGGTTTGA CCAA G CC C CC AC 



TTTACCTAACAGAGCTGGC3GCCGGTGGTTTGAATCCAACCGCCCCGGTCAGCCCACGT 

233Cv ZS^Cv 2350v 2S60v 2870v 2880v 
TCACTATA;:T3AGGA'G"G:ACGGACAGC3AATCTGGCCATCGAGCTTGACTCAGCCA 
T A A "A a c "GAGC- A T 33 33 r GC - CTGGCC T GAGCTTGACTC G A 
TGA AC A'AACGAGGA'A * 2 33 3 CG'GCGGCCAACCGGCCCTGGAGCTTGACTCCGGGA 



50 



— -^3. 23: 3 / 292Cv 2930v 2940v 

-BURMA CAGAT3*C33C:3GGCCG*GC:3GCGTC3GGTCACCC:CGGCGTTGTTCAGTACCAGT 
GA 3' 33223 32 '3"3C23G 'GT G'C CC GGCGTTGT C GTA CAGT 
•MEXICO GTGAAGTAGGCCGCGCATG'GCCGGGTGTAAAGTCGAGCCTGGCGTTGTGCGGTATCAGT 



55 



-BURMA 
-MEXICO 



<l i Z ^ * 

TTAC T 32AGG'G 



2363. 2970v 29SCv 2990v 3000v 
"GCCG3A'3C3GCAAGTCCCGCTCTATCACCCAAGCCGATGTGGACG 

33 3 3 "3 GGCA-GTC ~C T CA GC GATGTGGA G 
"3 3 2 CG3C3TGGC A A3*2AAAGTCC3TGCAACAGGCGGATGTGGATG 
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-BURMA 
-MEXICO 



3050v 3060v 
"GCGCCGTCGCGGCTTTGCTGCTT 
"GGCG CG CG GGCTTTGC GC T 
' jGCGGCGCCGGGGCTTTGCGGCAT 






-BURMA 



-BURMA 
-MEXICO 

-BURMA 
-MEXICO 



3::cv 3i :cv 3i20v 
:c-g3gg:gc:ggg*tgtcattgatgaggctc 

-■j -G GGGT T GTCATTGATGAGGC C 

; , z . j .-'j'j'j ' ■ tj i »_ ATTGATGAGGCCC 



3-3:. 3 1 5 : v 3150v 3170v 3130v 

-burma :A T :::':::::r::::^3C'3C'3C':::cH'GCAGCGGGccGCCACCGTCCACCTTC 
c tc c~c::::: cac -g:~gc~ t ca atgcagcg gc gc c gt cacct c 
-Mexico c t 'cg:'::i::::c:c"gcgc""aca t ;tg.:agcgtgctgcatctgtgcacctcc 



; i ira C j.-* C -v. 2 G A A C : w .- G A 1 C C v. 
I !jj j^i^.j^m ^.j-i ... ^ 
; i ooo'oAC- _ 3 A A ~ ^ A G A ' 0 C C 



J c 

TCAGGC 



3210v 322Gv 3230v 3240v 
jCC A TCG A C777GAGCACGCTGGGCTCGTCCCCGCCA 
;CCA' GA TTTGAGCAC C GG CT T CC GC A 
5CCA7AGATT77GAGCACACCGGTCTGATTCCAGCAA 



32 70v 



3290v 



3300v 



CT'AGGCC: OACCTCCTGGTGGCATGTTACCCATCGCTGGCCTGCGGATG 
T GGCC GA -7 G CCC AC TC TGGTGGCATGT ACCCA CG TG CCTGC GATG 
TACGGCCGGAGTTGG'CCCGACTTCATGGTGGCATGTCACCCACCGTTGCCCTGCAGATG 






33I0v 3320v 3330v 3340v 3350v 3360v 
-BURMA T ATGCGAGC T CATCCGTGG-GCATACCCCATGATCCAGACCACTAGCCGGGTTCTCCGTT 
T TG GAG T TCCGTGGTGC TACCC A ATCCAGAC AC AG GGT CTCCGTT 
-MEXICO TCTGTGAGT'AGTCCGTGGTGCTTACCCTAAAATCCAGACTACAAGTAAGGTGCTCCGTT 

3370v 338Cv 3390v 3400v 3410v 3420v 
-BURMA CGTTGTTCTGGGGTGAGCCTGCCGTCGGGCAGAAACTAGTGTTCACCCAGGCGGCCAAGC 

C T TTCTGGGG GAGCC GC GTCGG CAGAA CTAGTGTTCAC CAGGC GC AAG 
-MEXICO CCCTTTTCTGGGGAGAGCCAGCTGTCGGCCAGAAGCTAGTGTTCACACAGGCTGCTAAGG 

3430v 3^40v 3450v 3460v 3470v 3480v 
-BURMA CCGCCAACCCCGGCTCAGTGACGGTCCACGAGGCGCAGGGCGCTACCTACACGGAGACCA 
CCGC ACCCCGG TC T ACGGTCCA GAGGC CAGGG GC AC T AC AC A 
-MEXICO CCGCGCACCCCGGATCTATAACGGTCCATGAGGCCCAGGGTGCCACTTTTACCACTACAA 



45 



3490v 3500v 
-BURMA C TA T7 A 77G C C A C A GC A. G A " 
CTAT AT - ^ AC GCAGA1 
-MEXICO CTATAATTGCAAC7GCAGA" 



3510v 3520v 3530v 3540v 
GCCCGGGGCCT7AT7CAGTCGTCTCGGGCTCATGCCATTG 
GCCCG GGCCT AT CAGTC TC CGGGCTCA GC AT G 
GCCCGTGGCCTCA'ACAGTCCTCCCGGGCTCACGCTATAG 



50 



55 



-BURMA 
-MEXICO 

-BURMA 
-MEXICO 



33=Cv 3550- 35?Cv 3530v 3590v 3600v 
TTGC t CT'GACGCGCCACa:-GAGAAGT3CG t CA t CAT t GACGCACCAGGCCTGCTTCGCG 
«~ 7 G GT AT TTGAC C CC GGCCTG T CG G 
;-:a~GTGTTATAC7TGACTCTCCCGGCCTGTTGCGTG 



'6 :"v 



^GG o^'jC 

*GG~Gu''; 

* G G ■ j u G ' 



35AOv 3650v 3660v 
-"CCTCGCTGGTGGCGAAATTGGTC 
r "OCT C GGTGGCGA TTGGTC 
:C"CCT77CGGGTGGCGAGGTTGGTC 
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•BURMA 

-mexic: 



* U v. J - . 



J ' - U v 
- , , 



3710V 
7 r ,7Tr, 



3720v 
377GACACCCTGGCTG 



--^.jAC CAA7GTTGAC CT GC G 

: c:c~ga::gcaa t g7tgacgtgcttgcgg 






-BURMA 
-MEX i:3 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-8URMA 
-MEXICO 

-BURMA 
-MEXICO 

-TASHKENT 
-BURMA 
-MEXICO 



-MEXICO 



3770v 3730v 

"ggctgaggagcttggccaca 
* gc^gaggagct ggccac 
"gc^gaggagctgggccacc 



J^* ■- - J _ _ J , J „ - J _ 

GG C C 3 3 C 3 C 0 3 3 ~ G G 1 3 3 T 



A T u T G C 0 A C - G G - G C 



; c ^ J v 

::::gagc 



-j- ^ ^ : j^ai 



383Cv 3840v 
r C3AACAGGGCCTTCTCT 
; GA CAGGGCCTTCTCT 
"TGAGCAGGGCCTTCTCT 



" O 1 J 1 



35gCv 3890v 3900v 
:3'AACA7T T GAATTAACAGACATTG 
37 ACATTTGA TAAC GACATTG 
"G'GACATTTGAGCTAACTGACATTG 



393Cv 3940v 3950v 3960v 
TGCAC T GCCGCATGGCCGC:CCGAGCCAGCGCAAGGCCGTGCTGTCCACACTCGTGGGCC 
- AGCCA G AA GC GT TGTCCAC CT GT GGCC 
ATGGCGGCCOCTAGCOAAAGGAAAGCTGTTTTGTCCACGCTGGTAGGCC 



i oCACTGCCGCATGGC 
TGCACTGCC 



_ 3970v 39SC-/ 3990v 4CC0v 4010v 4020v 

GC^ACGGCGGTCGCACA-;GC T CTACAA7GCTTCCCACTCTGATGTTCGCGACTCTCTCG 
G 'A GGC G CGCACAA. GC T "A ATGC CAC C GATGT CGCG CTC CT G 

GGTATGGCAGACGCACAAGGC T TTA'GATGCGGGTCACACCGATGTCCGCGCCTCCCTTG 



4030v 



4040v 



CC C G T i 
C CG 7 
CGCGC7 



4060v 4070v 4080v 
GGCCCCGTACAGGTCACAACCTGTGAGTTGTACGAGCTAG 
GGCCCCGTACAGGT ACAAC TGTGA TTGTACGAGCTAG 

ggc:::g'acaggttacaacttgtgaattgtacgagctag 
g3 c g~ g ac ac tgtgaa t t gagct g 

cccactc-cg-3gc3ggttactgccaccacctgtgaactctttgagcttg 



-TASHKENT T 



-BURMA 



4100v aiiov 4120v 4130v 4140v 

gg t cgaga:agg::agga*ggctccgccgtccttgagctcgatctctgca 



dO90v 

, „ p n „ „ 

tggaggccatggtcgaga: ggccaggatggctccgccgtccttgagct gatct tgca 

tggaggccatggtcgagaagggccaggatggctccgccgtccttgagcttgatctttgca 
t gaggc atggt gagaagggcca ga gg tc gccgtcct gag t gat t tgca 

tagaggcgatggtggagaagggccaagacggttcagccgtcctcgagttggatttgtgca 



4150v 41fi0v 
-TASHKENT A C C G7 G A C G T 3 ~C C A GGA " 

accgtgacgtgtccagga 
-burma acogtgacg 

iCG 3 A G 
-MEXICO GCC3AGAT370TCCCGCA 



4170v 



J 1 L ! .Moyf 

37CCAGG' 3 
- - r r , 



CACCT 



4130v 4190v 4200v 
"7C:aGAAAGA77GCAA7AAGTTCACCACGGGAG 

ttccagaaagattg aa aagttcaccac gg g 
"ccagaaaga7tgtaacaagttcaccacaggtg 
"c:agaa gattgtaacaagttcac ac gg g 
":;agaaggat7G7aacaagttcacgaccggcg 
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---Cv 4250v 426Gv 

-Tashkent aga 1 : ^ :g : : : - ~GG~--~.r gg : : : a g 33 : a ™:ggc:~ggagtaagaccttctgtg 
-ga::a" i:::-"G"---G*^:-G::-r I GG-:-T':33CCTGGAG aagaccttctg g 

-burma -ga::i"g:::a*g3'---g*ggg::-gggca t c':3gc:tggagcaagaccttctgcg 
agac a — gc :a"g -aag' 33 c^ggg atc" ctggag aagac tt tg g 

-mexico ^ga:aa"3C3:a:3gc-aagtcgg"cagggtat:'tccgctggagtaagacgttttgtg 



Or KEN ^ 



J isOv urgcv 43C0v 4310v 4320v 
"" ':33:::C^:3"::3 t 3: t ^"3'13--GGC^mTTCTGGCCCTGCTCCCTCAGG 

10 cc - " ^::c '33"c:g gctattgagaaggc^attctggccctgctccctcagg 

-BURMA CCC T :^"33C:r':G":CGC3C-T'3AGAAGGCTATTCTGGCCCTGCTCCCTCAGG 

:c:~ i3cc: '3G"::g gc a—gagaaggctattct ccct t cc ca g 
-mexico c::*3T"G3::::'3G"":c373C3a"3agaaggctattctatcccttttaccacaag 

15 a 32Cv 4]::., 435.3, 436Gv 4370v 4380v 

-TASHKENT 'jT3T3T m AT33GGA-3::" T GA'3iCACC3TCTTCTCGGCGCGTGTGGCCGCAGCAA 
, _ — : ^- : --::c:3TC"CTCGGCG TGTGGCCGCAGCAA 

"mC3C-C-*3C:"'3AT3ACAc:gtc t TCTCGGCGGCTGTGGCCGCAGCAA 



-BURMA 
20 -MEXICO 



'CAGTATTCTCTGCTGCCGTGGCTGGCGCCA 



4390v J4CC-/ -a;Cv 442Gv 4430v 4440v 
-TASHKENT AGGCGTCCATGGTGTTT3A3AATGACTTTTCTGAGTTTGACTCCACCCAGAATAATTTTT 
AGGC TCCATGG t GTT-G4GAA t GAC"t TTCTGAGTT tgaCTCCACCCAGAATAA TTTT 
25 -BURMA AGGCATCCATGGTGTT'GiGAATGACTTTTCTGAGTTTGACTCCACCCAGAATAACTTTT 

CCATGG'GTTTGA AATGA TTTTCTGAGTTTGACTC AC CAGAATAACTTTT 
-MEXICO GCCATGCCATGGTGTTTGAAAATGATTTTTCTGAGTTTGACTCGACTCAGAATAACTTTT 

4450v 4460v 4470v 4480v 4490v 4500v 

30 -TASHKENT CCCTGGGCCTAGAGTGTGCTATTATGGAGAAGTGTGGGATGCCGAAGTGGCTCATCCGCT 

C CTGGG CTAGAGTGTGCTATTATGGAG AGTGTGGGATGCCG AGTGGCTCATCCGC 
-BURMA CTCTGGGTCTAGAGTGTGCTATTATGGAGGAGTGTGGGATGCCGCAGTGGCTCATCCGCC 

C CT GGTCT GAGTG GC ATTAT3GA GAGTGTGG ATGCC CAGTGGCT TC G 
-MEXICO CCCTAGGTCTTGAGTGCGCCATTATGGAAGAGTGTGGTATGCCCCAGTGGCTTGTCAGGT 

35 

45i0v J52Gv 453Cv 4540v 4550v 4560v 

-TASHKENT tgtaccaccttataaggtctgcgtggatcctgcaggccccgaaggagtccctgcgagggt 
TGTA CACCTTATAAGGTCTGCGTGGATC tgcaggccccgaaggagtc ctgcgagggt 
-burma tgtatcacc t tataag^c t gcgtggatcttgcaggccccgaaggagtctctgcgagggt 

40 T GTA CA T GGTC GCGTGGATC TGCAGGCCCC AA GAGTCT TG GAGGGT 

-MEXICO TGTACCATGCCGTCCGG^CGGCGTGGATCCTGCAGGCCCCAAAAGAGTCTTTGAGAGGGT 

45 7 0v 4530v 4590v 4600v 4610v 4620v 
•TASHKENT GT^GGAAGAAACAC'CCGG'GAGCCCGGCAC'C^^CTATGGAATACTGTCTGGAACATGG 
45 TTGG A AG A A4C a C'CC 3 3 '3 4GCCC3GC a C^CTTCTATGGAATACTGTCTGGAA ATGG 

-BURMA ™GGAAGAAACAC'::3G"3AGCCC3GCACTCTTCTATGGAATACTGTCTGGAATATGG 
T TGGAAGA A CA TC 3G7GAGCC G3CA T CT TGGAATAC GT TGGAA ATGG 
-MEXICO T C7GGAAGAAGCAT':'33'3AGC:3G3CA3C^ T GC"CTGGAATACGGTGTGGAACATGG 

50 dl 53Cv 4n40v 455Cv 4660v 4670v 4680v 

-TASKENT CCGTTATCACCCAT""ACGATTTCCGCGATTTGCAGGTGGCTGCCTTTAAAGGTGATG 
CCGTTAT ACCCA ""A 3A 7"CGCGA7TT AGGTGGCTGCCTTTAAAGGTGATG 
-BURMA CCGT-ATTACCCAC'3"ATGAC TT CCGCGATTTTCAGGTGGCTGCCTTTAAAGGTGATG 
C T AT' CCCA T 3 "ATGA ~ T CCG GA T CAGGT GC GCCTT AA GG GA G 
55 -MEXICO CAATCAT'3CCCA T *3C'A'3 * G"CCGGGACCTCCAGGTTGCCGCCTTCAAGGGCGACG 



: - - - < -"^ 473Cv 4740v 

-Tashkent a": 3': - ;~ r~::y:^~r ;:-33Ggc t gctgtcctgattgctg 
M"C].i'-G'G-:""v:'."'"-r- :g~:-g-g~::agg gc'gctgtcctgat gc g 

-burma at-: ]A'iG'G:"'G ":g* :« g-g~ : oaggagctgctgtcctgatcgccg 

a — ] * g* :~ " :g :-g.-g ::agg gc g t ct at gc g 

-mex:co a:-:vg~:3~::~:~g*- ~-o:gc:^:g:c:aggcgccggttcgcttatagcag 

-oCv ,~7C* -73G v 4790v 4800v 

- TAS H i* c N i o ^ . j . o j v- ■ ~ - *» j: _ j-^-jj jjj . _ j _ ^ j -*. ju:>oi ATGCAGGTGTTGTGG 

GC'G'GSC" -~G ~ 3 - : ^ : ' "i "~GI3 ICGAT 3G~7TGTATGCAGGTGTTGTGG 
-BURMA GCT3"GG:'T*GAAG"G--rr^GA"':;G:::3A'::3G"TGTATGCAGGTGTTGTGG 

GC'G'GG ~ T G--G~ .^--GG G- "COG CCGA T GG 7GTATGC GG GTTGT G 
-MEXICO GC'G'3G"'3,— 3"3 — 33C"3-:"C:33::3A TT 3GGCTGTATGCCGGGGTTGTCG 

4810v 4.520. -iSGCv 434Cv 4850v 4860v 

-tashkent t gac icc i ogc g"ggcgggg"cc0gac3tcg7gcgcttgtccggccggcttactgaga 
tg cccc:ggcg"ggcgcgc"cicga gt g~gcgcttg ccggccggcttac gaga 
-burma tggczcccggcctggcgcgcccctgatgttgtgcgcttcgccggccggcttaccgaga 

T GCCCC 33 07 33 31 0' 00 GA'GT GT CG TTCGCCGG CGGCTT C GAGA 
-MEXICO TCGCCCCjGGGC t CGGGGCCC t ACCCGATGTCGTTCGATTCGCCGGACGGCTTTCGGAGA 

4870v 4SS0-.- 4890v 490Cv 4910v 4920v 
-TASHKENT AGAATTGGGGCCC'GGCCC'GAGCGGGCGGAGCAGCTCCGCCTTGCTGT 

AGAATTGGGGCCC'GGCCC'G-GCjGGCGGAGCAGCTCCGCCT GCTGT 
-BURMA AGAATTGGGGCCC'GGCCC'GAGCGGGC jGAGCAGCTCCGCCTCGCTGTTAGTGATTTCC 

AG A A "GGGG CC'G CO G-GCGG3C GAGCAGCTCCGCCTCGC GT GATTTCC 
-MEXICO AGAAC'GGGGGCC'G-*:: "GAGCGGGC.-GAGCAGC'CCGCCTCGCCGTGCAGGATTTCC 

4930v -^Ov 4950v 4960v 4970v 4980v 
-BURMA TCCGCAAGCTCACGAATGTAGCTCAG-TGTGTGTGGATGTTGTTTCCCGTGTTTATGGGG 
TCCG A G T ACGAA'G" GC CAGA" TGTGT GA GT GT TC G GTTTA GGGG 
-MEXICO TCCGTAGGTTAACGAATG7GGCCCAGAT T TGTGTTGAGGTGGTGTCTAGAGTTTACGGGG 



35 



4990v 50C0v 501Cv 5020v 5030v 5040v 
-BURMA TTTCCCCTGGAC"CGT T C ATAACC~GATTGGCATGCTACAGGCTGTTGCTGATGGCAAGG 
TTTCCCC GG CT GT'C ATAACCTGAT GGCATGCT CAG CT TTG TGATGG AAGG 
-MEXICO TTTCCCCGGG T CTGGTTOATAACCTGATAGGCATGCTCCAGACTATTGGTGATGGTAAGG 



40 



505Cv =060v 507Cv 5080v 5090v 5100v 
-BURMA CACATTTCACTGAGTCAG""AAAACCAGTGCTCGACTTGACAAATTCAATCTTGTGTCGGG 

C CAT" AC GAGTC 3" ^ CC T CT GAC T ACA A TCAAT TG CGG 
-MEXICO CGCATTTTACAGAGT.:'G"AAGC:TATAC'TGACCTTACACACTCAATTATGCACCGGT 



45 



50 



511Cv 5120 513Cv 5140v 5150v 5160v 

-burma tggaa-g—'— c-'3-:""3C'3:g:c:-3ggt':gcgaccatgcgccctcggcct 
gaa'gaa'aaca'r gcgcgcgg^gggt^cgc accatgcgccct ggcct 
-mexico ctgaatgaataaca"*gg"'"c t 3cgc:ca'gggttcgccaccatgcgccctaggcct 

5I70v 5132 v 5190V 52C0v 5210v 5220v 
-BURMA ATTT""3C'3C"C:"0-'G*"""G::t;-3CT3CCCGCGCCACCGCCCGGTCAGCCG 
TTT'G "G ~G ~0G~: T"G"~ "GCCATG 'GOCCGCGCCACCG CCGGTCAGCCG 
-MEXICO CTTTTGCT3 T "3"::""*G"":"3CC T -iT3T'3CCCGCGCCACCGACCGGTCAGCCG 
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10 



15 



52 20. 52-:* 52f 525Cv 5270v 5280v 

-Burma ^c'gg:::: rs^cG'vi'V::; j:gc-gc3g:ggttc:ggcggtggtttctggggtgaccgg 
-:t3Gc:g::g-:g"g;:3g:3:-3: ggigg:' ccggcjGtggtttctggggtgaccgg 
-mexico *:'ggc:g::g" : :g'ggg::gcg:-g:3Gcggtac:3gc5Gtggtttctggggtgaccgg 

5ZCO 53r/ 53:%' 5320v 5330v 5340v 

-surma g^"3-*":":-g:::"":3c— "::::^a--'cat::aaccaaccccttcgcccccgat 
G**GA'-:*:AGc:r":~:--*:::r-TiT-:A-:cAAccAAccccTT gcccc ga 
-mexico g"g-":*:-3-j:: t ' ■at-:a'::aac:aac:cctttgccccagac 

535:, 535:. 537;v 53cCv 5390v 5400v 
-BURMA G~:AC:3C~3C3G::3333;"GACr:G~3"CGC:AACCCGCCCGACCACTCGGCTCC 



;AC:"3 "CGCCAACC GCCCG CCACT GGCTCC 

-mexico g t, '3::3:'gc3'::333 t :"33-c":3c:"cgc:aaccagcccggccacttggctcc 



541Cv 542Cv 5 4 3 C v 5440v 5450v 5460v 

20 -burma gc t '3gcg'gaccagg:::-g:3ccccgc:gttgcctcacgtcgtagacctaccacagct 

cttggcg ga caggcccagcgcccc ccg tgcctc cgtcg gacct ccacagc 

-MEXICO ACTTGGCGAGA7CmGG:CCAGC3CCCCTCCGCTGCCTCCCGTCGCCGACCTGCCACAGCC 
5^7Cv 5430v 5490v 5500v 5510v 5520v 

25 -BURMA ggggccgcgccgc t aa:cgcgg;cgctccggcccatgacaccccgccagtgcctgatgtc 

GGGGC GCG CGCT AC GC GT GC CC GCCCATGACACC C CC GT CC GA GT 
-MEXICO GGGGC TGCGGCGCTGACGGCTGTGGCGCCTGCCCATGACACCTCACCCGTCCCGGACGTT 

553Cv 5540v 5550v 5560v 5570v 5580v 
30 -BURMA GAC7CCCGCGGCGCCATCTTGCGCCGGCAGTATAACCTATCAACATCTCCCCTTACCTCT 

GA TC CGCGG GC AT T CGCCG CAGTATAA T TC AC TC CCCCT AC TC 
-MEXICO GAT T CTCGCGGTGCAArTC T ACGCCGCCAGTATAATTTGTCTACTTCACCCCTGACATCC 

5590v 56C0v 55LOv 5620v 5630v 5640v 
35 -BURMA TCCGTGGCC ACCGGCAC T AACC T GGTTCTTTATGCCGCCCCTCTTAGTCCGCTTTTACCC 

TC GTGGCC C GGCATAA - GT CT TATGC GCCCC CTTA TCCGC T T CC 
-MEXICO TCTGTGGCCTCTGGCACTAATTTAGTCCTGTATGCAGCCCCCCTTAATCCGCCTCTGCCG 

5650v 56-Ov 5670v 5680v 5690v 5700v 
40 -BURMA CTTCAGGACGGCACCAATiCCCATATAATGGCCACGGAAGCTTCTAATTATGCCCAGTAC 

CT CAGGACGG AC AATAC CA AT ATGGCCAC GA GC TC AATTATGC CAGTAC 
-MEXICO CTGCAGGACGGTACTAATACTCACATTATGGCCACAGAGGCCTCCAATTATGCACAGTAC 

57I0v 5720v 5730v 5740v 5750v 5760v 
45 -BURMA CGGGTTGCCCGTGCCAC^TCCGT'ACCGCCCGCTGGTCCCCAATGCTGTCGGCGGTTAC 

CGGG"GCCCG GC Af A'CCGTTACCG CC CT GT CC AATGC GT GG GG TA 
-MEXICO CGGGTTGCCCGCGC'AC-i'CCGTTACCGGCCCCTAGTGCCTAATGCAGTTGGAGGCTAT 

577C/ 57SCv =79Cv 55C0v 5310v 5820v 
50 -BURMA GC C A TCTC C ATf TC a "C "33 CCACAGACCACCAC CACCCCGACGTCCGTTGATATGAAT 

GC AT ":at TC T*C*3GCC CA AC ACCAC ACCCC AC TC GTTGA ATGAAT 

-mexico gc t a~:—:a— :~- :*33:r :a aacaaccacaacccctacatctgttgacatgaat 
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-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 



-BURMA 



TCAATAAC: 
~C AT AC 



ATCC^ A AG'3 A j C3CC"*' 
A i ^^A~o^3A*jujC~ 
5 9 5 0 v 555; 

ij i J'Jv. : vj^G^AG'J^OCL 

GT GCjAGGAGGA GC 
G TT G C ' G a G G A G G A a G 0 ' 

5c::v 5::- 

AATTCCTA-iCAA'A;::. 
AA 'CC'ATAC A^-c 
AAC T CCTA^ACC-:-':C r 



:=6Cv 53/Ov 5880v 
---t :3 - : AG'CCA 3CCCGGCATAGCCTCTGAGCTTGTG 

*r g ; gt :a :c ggcatagc tctga t gt 

"G'C AGGA I"G' T CA ACCTGGCATAGCATCTGAATTGGTC 

> 59 1 D / 592Cv 5930v 5940v 
-CACTA'CGTiACCAAGGCTGGCGCTCCGTCGAGACCTCTGGG 

lAC'A ;g a.: :aagg tggcgctc GT GAGAC tctgg 

~*C Av, w jCAATCAAGGTTGGCGCTCGGTTGAGACATCTGGT 
5 9 7 C y 5930v 5990v 6000v 

'-■:: t "ggt:"gt-at3ctttgcatacatggctcactcgta 
acc~: gg t :""G" atg t tgcatacatggctc c gt 
:a::'::3g t :"g~:a-"atgcatacatggctctccagtt 

^i:* 5:ACv 6050v 6060v 

:c:'a-accgg"gc:ct:gggctgttggactttgcccttgag 
'a*;;:gg-gcc:t gg t tggactttgcc t gag 
'gccct t ggcttactggactttgccttagag 



SO'Cv 5C5C ■ 5C9Cv 5100v 6110v 6120v 

cttgagt t tcgcaac:tt-c:c::gg'aaca.ccaa-acgcgggtctcccgttattccagc 
cttgagtttcgcaa ct acc cc g t aacaccaatac cg gt tcccgtta tccagc 
-Mexico cttgagtttcgcaatctcaccacctgtaacaccaatacacgtgtgtcccgttactccagc 



6I30v SlACv 615Cv 6150v 6170v 6180v 

-BURMA actgctcgccaccgccttcgtcgcggtgcggacgggactgccgagctcaccaccacggct 
actgctcg cac c cg g g gacgggactgc gagct accac AC GC 
-MEXICO actgctcgtcactccgcccgaggggcc---gacgggactgcggagctgaccacaactgca 

6190v 52C0v 6210v 6220v 6230v 6240v 

-burma gctacccgctttatgaaggacctctattttactagtactaatggtgtcggtgagatcggc 
GC ACC G tt atgaa ga ctc a tttac g taatgg gt ggtga tcggc 
-Mexico gccaccaggttcatgaaagatctccactttaccggccttaatggggtaggtgaagtcggc 



40 



6250v 5260v 6270v 6280v 6290v 6300v 

-burma cgcgggatagccctcaccctgttcaaccttgctgacactctgcttggcggcctgccgaca 
cgcgggatagc CT AC T T aaccttgctgacac ct ct ggcgg ct ccgaca 
-Mexico cgcgggatagctctaacat-acttaaccttgctgacacgctcctcggcgggctcccgaca 



45 



63I0v 632C 

-burma gaat7gatttcgtcgg" 
gaatt atttcgtcggc 
-mexico gaa^-aat—cg'CGgc 



;330v 

- AGuTG t 



t o i 



53^0v 6350v 5360v 
'CTACTCCCGTCCCGTTGTCTCAGCCAAT 
" ~A TCCCG CC GTTGTCTCAGCCAAT 
""ATTCCCGCCCGGTTGTCTCAGCCAAT 



50 



6370v 538Cv 539Cv 54G0v 6410v 6420v 
-BURMA GGCGAGCCGACT3'"AAG"GTATACATCTGTAGAGAATGCTCAGCAGGATAAGGGTATT 
GGCGAGCO AC GT AAG * TA^ACA'3 GT GAGAATGCTCAGCAGGATAAGGGT TT 
-MEXICO GGCGAGCCAACCGTGAAGC'CTATACATCAGTGGAGAATGCTCAGCAGGATAAGGGTGTT 



55 



-BURMA 
-MEXICO 



6430v k 
b l A A ^ ^ C G C A ' j A C 
G C A T C C 0 A o A 
GC7A7CCCCIACGAT 



5450v 5460v 6470v 6480v 

:ggagaa'ccg t gtggttattcaggattatgataac 
gg 3a 7c cgtgtggt attcaggattatga aac 

'GG t 3AT~CGCGTGTGGTCATTCAGGATTATGACAAC 
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-BURMA Z~~: 
-MEXICO CAGC 



-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 



C G A G C AA'GA'G'GC 



^i'Jv 6530v 6540v 

":agc:cca^cgcgccctttctctgtcctt 
gc ccatc cg ccttt tctgt ct 

3CCGCGCCATCTCGGCCTTTTTCTGTTCTC 

5530/ 6590v 6600v 
CACCGC'GCCGAGTATGACCAGTCCACTTAT 

:ac gc gccgagtatgaccagtccactta 
-acgcagccgagtatgaccagtccacttac 



juu j - v_ C . G G i v-^j 



ao L j l o L * oij l L j .j ^ ^ 'j 

bGC jCGC-GGCC 3"T gcccg 
-MEXICO GGCGCGCAGGCCGTAGCCCGA 



5530v 5540v 6650v 6660v 
r"ATGT T 'C"AC^07GTGACCTTGGTTAATGTTGCGACC 
"AT - TC GAC GTGAC TTGGT AATGTTGCGAC 
''ATATCTCGGACAGCGTGACTTTGGTGAATGTTGCGACT 

5690v 5700v 6710v 5720v 

-*cgccgat'ggac:aaggtcacacttgacggtcgcccc 
3a tgg ccaa gtcac ct gacgg cg ccc 

ggtccaaagtcaccctcgacgggcggccc 



573Cv 

-BURMA CTCTCCACCA' 
CTC C AC T 
-MEXICO CTCCCGACTGT 



5:4 °' 67 ^' v 6750v 6770v 6780v 

:cagc-g t actcgaagaccttctttgtcctgccgctccgcggtaagctc 

AGCA T A TC AAGAC TTCTTTGT CT CC CT CG GG AAGCTC 

-gagcaatat-ccaagacattctttgtgctcccccttcgtggcaagctc 



^5790v 6300V 6810v 6820v 6830v 6840v 

-8urma t ctt'ctgggaggcaggcacaactaaagccgggtacccttataattataacaccactgct 
tc tt tgggaggc ggc-caac aaagc gg ta ccttataattataa ac actgct 

-MEXICO T CCTTTTGGGAGGCCGGCACAACAAAAGCAGGTTATCCTTATAATTATAATACTACTGCT 



-BURMA 



6850v 6860v 6370v 6880v 6890v 6900v 

AGCGACCAACTGCTTGTCGAGAATGCCGCCGGGCACCGGGTCGCTATTTCCACTTACACC 
AG GACCA T CT T GA AATGC GCCGG CA CGGGTCGC ATTTC AC TA ACC 

-MEXICO AGTGACCAGATTCTGATTGAAAATGCTGCCGGCCATCGGGTCGCCATTTCAACCTATACC 

6910v 6920v 6930v 6940v 6950v 6960v 
-BURMA ACTAGCCTGGGTGCTGGTCCCGTCTCCATTTCTGCGGTTGCCGTTTTAGCCCCCCACTCT 

AC AG CT GG GC GGTCC GTC CCATTTCTGCGG GC GTTTT GC CC C CTC 
-MEXICO ACCAGGC7TGGGGCCGG7CCGGTCGCCATTTCTGCGGCCGCGGTTTTGGCTCCACGCTCC 



-BURMA 



6970v 6980v 6990v 7000v 7010v 7020v 

GCGCTAGCATTGCTTGAGGATACCTTGGACTACCCTGCCCGCGCCCATACTTTTGATGAT 
GC^CT GC 'OCT 3AGGATAC TT GA TA CC G CG GC CA AC TTTGATGA 

-MEXICO GCCCTGGCTC-GCTGGAGGATACTTTTGATTATCC5GGGCGGGCGCACACATTTGATGAC 



-BURMA TTCTGu CCAGAG'GC Z 3C C Z 1 1 

-Mexico TTCTGc:cTGAiTGc::-:G-:*' 



7050v 
"j ^ ^ , 



7060v 7070v 7080v 

rcagggctgcgctttccagtctactgtcgct 
caggg "g gctttccagtc actgtcgct 
-agggttgtgctttccagtcaactgtcgct 



709Cv 

-BURMA GAGC'TCAGCGCC 
GAGC" CAGCGC 0 

-mexico gag;~::ag:gcc 



7 110v 7I20v 7130v 7140v 

"aagatgaaggtgggtaaaactcgggagttgtagtttatttgcttg 
"AA - aaggtggg t aaaactcgggagttgtagtttatttg tg 
-aag*'aaggtgggtaaaactcgggagttgtagtttatttggctg 



4; . 



-burma tgccc:::" 
-hexico *Gc::-cr 



'I'Cv 71SOv 7190v 
"""CATTTCTGCGTTCCGCGCTCCC 
'": TTTCT GT CCGCGCTCCC 
'"::"TTTC*CGGTCCCGCGCTCCC 



10 



-BURMA 



-MEXK 
A number of open reading frames, which are potential
coding regions, have been found within the DNA
sequences set forth above. As has already been
noted, consensus residues for the RNA-directed RNA
polymerase (RDRP) were identified in the HEV (Burma)
strain clone ET1.1. Once a contiguous overlapping set
of clones was accumulated, it became clear that the
nonstructural elements containing the RDRP, as well as
what were identified as consensus residues for the
helicase domain, were located in the first large open
reading frame (ORF1). ORF1 covers the 5' half of the
genome and begins at the first encoded met, after the
27th bp of the apparent non-coding sequence, and then
extends 5079 bp before reaching a termination codon.
Beginning 37 bp downstream from the ORF1 stop codon in
the plus 1 frame is the second major open reading
frame (ORF2), extending 1980 bp and terminating 68 bp
upstream from the point of poly A addition. The third
forward ORF (in the plus 2 frame) is also utilized by
HEV. ORF3 is only 370 bp in length and would not have
been predicted to be utilized by the virus were it not
for the identification of the immunoreactive cDNA
clone 406.4-2 from the Mexico SISPA cDNA library (see
below for detailed discussion). This epitope
confirmed the utilization of ORF3 by the virus,
although the means by which this ORF is expressed has
not yet been fully elucidated. If we assume that the
first met is utilized, ORF3 overlaps ORF1 by 1 bp at
its 5' end and ORF2 by 328 bp at its 3' end. ORF2
contains the broadly reactive 406.3-2 epitope and also
a signal sequence at its extreme 5' end. The first
half of this ORF2 also has a high pI value (>10)
similar to that seen with other virus capsid proteins.
These data suggest that ORF2 might be the
predominant structural gene of HEV.
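
The offsets quoted above can be turned into approximate genome coordinates. The following is a minimal sketch, not the method used to annotate the sequence. It assumes that the stated lengths count coding bases only, that a stop codon is 3 bp, and that "37 bp downstream" means 37 bases lie between the stop codon and the next start. The constant and function names are illustrative.

```python
# Sketch: derive 1-based ORF coordinates from the offsets quoted in the text.
# Assumptions (not stated in the source): ORF lengths exclude the stop codon,
# and "N bp downstream" means N bases separate the stop codon from the start.

NONCODING_5P = 27          # bp of apparent 5' non-coding sequence
ORF1_LEN = 5079            # bp of ORF1 before the termination codon
GAP_AFTER_ORF1_STOP = 37   # bp between the ORF1 stop codon and ORF2
ORF2_LEN = 1980            # bp of ORF2
STOP_CODON_LEN = 3


def span(start, length):
    """Return (first base, last base) for a region of the given length."""
    return (start, start + length - 1)


def frame(start, ref_start):
    """Reading frame of a start position relative to a reference start."""
    return (start - ref_start) % 3


orf1 = span(NONCODING_5P + 1, ORF1_LEN)
orf1_stop_end = orf1[1] + STOP_CODON_LEN
orf2 = span(orf1_stop_end + GAP_AFTER_ORF1_STOP + 1, ORF2_LEN)
orf3_start = orf1[1]       # ORF3 overlaps the last coding base of ORF1

print("ORF1:", orf1)                                  # ORF1: (28, 5106)
print("ORF2:", orf2)                                  # ORF2: (5147, 7126)
print("ORF2 frame:", frame(orf2[0], orf1[0]))         # ORF2 frame: 1
print("ORF3 frame:", frame(orf3_start, orf1[0]))      # ORF3 frame: 2
```

Under these assumptions the computed frames agree with the plus 1 and plus 2 frames described for ORF2 and ORF3.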

The existence of subgenomic transcripts prompted
a set of experiments to determine whether these RNAs
were produced by splicing from the 5' end of the
genome. An analysis using subgenomic probes from
throughout the genome, including the extreme 5' end,
did not provide evidence for a spliced transcript.
However, it was discovered that a region of the
genome displayed a high degree of homology with a 21
bp segment identified in Sindbis as a probable
internal initiation site for RNA transcription used in
the production of its subgenomic messages. Sixteen of
21 (76%) of the nucleotides are identical.

Two cDNA clones which encode an epitope of HEV
that is recognized by sera collected from different
ET-NANB outbreaks (i.e., a universally recognized
epitope) have been isolated and characterized. One of
the clones immunoreacted with 8 human sera from
different infected individuals and the other clone
immunoreacted with 7 of the human sera tested. Both
clones immunoreacted specifically with cyno sera from
infected animals and exhibited no immunologic response
to sera from uninfected animals. The sequences of the
cDNAs in these recombinant phages, designated 406.3-2
and 406.4-2, have been determined. The HEV open reading
frames are shown to encode epitopes specifically
recognized by sera from patients with HEV infections.
The cDNA sequences and the polypeptides that they
encode are set forth below.

Epitopes derived from Mexican strain of HEV:
406.4-2 sequence (nucleotide sequence has SEQ ID
NO. 13; amino acid sequence has SEQ ID NO. 14):
SEQ ID NO. 13:



10 



C 3CC A A C C A G C C C 1 G C "AC ' T j 
Ala Asn G 1 " ?, ~c I'y « 1 s _e_ 
1 5 



: t t ggc gag atc agg ccc 

.eu Gly Glu lie Arg Pro 
10 15 



agc gcc cr 



ber h 



'a 3 r n ? ri 



..j :cc G'c gcc gag ctg cca cag ccg ggg ctg 

_eu =r 0 73' A'3 Asp Leu Pro Gin Pro Gly Leu 
20 ^5 30 



CGG CGC TGA C2GC"TGGC GCCTGCCCA* GACACC7CAC CCGTCCCGGA 
Arg Arg 



46 



94 



143 



15 


CGTTGATTCT 


r n f r " ~ 


GCAA ~ T C T ACGCCG 


CC AG7ATAAT 


"G7C7AC7T 


CACCCCTGAC 


203 




ATCCTCTGTG 




jul A ' w 1 mA. i ■ . 


G:A7GCA 


GCCCCCC77A 


A7CCGCCTC7 


263 


20 


GCCGCTGCAG 


GACGGT 




A ~GGC CACA 


GAGGCC7CCA 


A77ATGCACA 


323 


GTACCGGGTT 


GCCCGC 


oC7A CTATCCGT'A 


ICGGCCCC7A 


G7GCC T AA7G 


CAG7TGGAGG 


383 




CTATGCTATA 




TCTT 7C7GGCCCA 


-"CAACCACA 


ACCCC7ACA7 


C7G77GACAT 


443 


25 


GAATTC 












449 






SEQ ID NO. 14:

Ala Asn Gln Pro Gly His Leu Ala Pro Leu Gly Glu Ile Arg Pro Ser
1               5                   10                  15

Ala Pro Pro Leu Pro Pro Val Ala Asp Leu Pro Gln Pro Gly Leu Arg
            20                  25                  30

Arg

406.3-2 sequence (nucleotide sequence has SEQ
ID NO. 15; amino acid sequence has SEQ ID NO. 16):
SEQ ID NO. 15:

GGAT ACT TTT GAT TAT CCG GGG CGG GCG CAC ACA TTT GAT GAC TTC TGC    49
     Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys
     1               5                   10                  15

CCT GAA TGC CGC GC  TTA GGC CTC CAG GGT TGT GCT TTC CAG TCA ACT     97
Pro Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr
                20                  25                  30

GTC GCT GAG CTC CAG CGC CTT AAA GTT AAG GTT                        130
Val Ala Glu Leu Gln Arg Leu Lys Val Lys Val
            35                  40
SEQ ID NO. 16:

Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys Pro
1               5                   10                  15

Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Val
            20                  25                  30

Ala Glu Leu Gln Arg Leu Lys Val Lys Val
        35                  40

The universal nature of these epitopes is
evident from the homology exhibited by the DNA that
encodes them. If the epitope coding sequences from
the Mexican strains shown above are compared to DNA
sequences from other strains, such as the Burmese
strain also set forth above, similarities are
evident, as shown in the following comparisons.
Comparison of 406.4-2 epitopes, HEV Mexico and Burma strains:

                                 10        20        30
MEXICAN (SEQ ID NO. 17)  ANQPGHLAPLGEIRPSAPPLPPVADLPQPGLRR
BURMA (SEQ ID NO. 18)    ANPPDHSAPLGVTRPSAPPLPHVVDLPQLGPRR
                                 10        20        30

There is 73.5% identity in a 33-amino acid overlap.

Comparison of 406.3-2 epitopes, HEV Mexico and Burma strains:

                                 10        20        30        40
MEXICAN (SEQ ID NO. 19)  TFDYPGRAHTFDDFCPECRALGLQGCAFQSTVAELQRLKVKV
BURMA (SEQ ID NO. 20)    TLDYPARAHTFDDFCPECRPLGLQGCAFQSTVAELQRLKMKV
                                 10        20        30        40

There is 90.5% identity in the 42-amino acid overlap.
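
The identity figure for the 406.3-2 comparison can be checked with a simple position-by-position count. This is a minimal sketch that assumes the two sequences are already aligned without gaps, as they are shown above. It is not the alignment program used to produce the reported figures, and the function and variable names are illustrative.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical residues in two ungapped,
    equal-length, pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# 406.3-2 epitopes as listed in the comparison above
MEXICO_406_3_2 = "TFDYPGRAHTFDDFCPECRALGLQGCAFQSTVAELQRLKVKV"
BURMA_406_3_2 = "TLDYPARAHTFDDFCPECRPLGLQGCAFQSTVAELQRLKMKV"

pct = percent_identity(MEXICO_406_3_2, BURMA_406_3_2)
print(f"{pct:.1f}% identity over {len(MEXICO_406_3_2)} residues")
# prints "90.5% identity over 42 residues"
```

The two sequences differ at four of 42 positions, which gives 38/42 and agrees with the 90.5% stated for this overlap.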

It will be recognized by one skilled in the
art of molecular genetics that each of the specific
DNA sequences given above shows a corresponding
complementary DNA sequence as well as RNA sequences
corresponding to both the principal sequence shown and
the complementary DNA sequence. Additionally, open
reading frames encoding peptides are present, and
expressible peptides are disclosed by the nucleotide
sequences without setting forth the amino acid
sequences explicitly, in the same manner as if the
amino acid sequences were explicitly set forth as in
the ET1.1 sequence or other sequences above.
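
The relationship between a principal DNA sequence, its complementary strand, and the corresponding RNA sequences can be sketched as follows. This is an illustrative example only. The nine-base fragment is the three codons ACT TTT GAT taken from the 406.3-2 sequence, and the function names are assumptions made for the example.

```python
# Map each DNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complementary strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def to_rna(dna: str) -> str:
    """RNA sequence corresponding to a DNA strand (T replaced by U)."""
    return dna.replace("T", "U")


principal = "ACTTTTGAT"                 # Thr Phe Asp codons from 406.3-2
complementary = reverse_complement(principal)

print(complementary)                    # ATCAAAAGT
print(to_rna(principal))                # ACUUUUGAU
print(to_rna(complementary))            # AUCAAAAGU
```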

DETAILED DESCRIPTION OF THE INVENTION 
I. Definitions

The terms defined below have the following 
meaning herein: 

1. "Enterically transmitted non-A/non-B
hepatitis viral agent, ET-NANB, or HEV" means a virus,
virus type, or virus class which (i) causes water-borne,
infectious hepatitis, (ii) is transmissible in
cynomolgus monkeys, (iii) is serologically distinct
from hepatitis A virus (HAV), hepatitis B virus (HBV),
hepatitis C virus (HCV), and hepatitis D virus, and
(iv) includes a genomic region which is homologous to
the 1.33 kb cDNA insert in plasmid pTZKF1 (ET1.1)
carried in E. coli strain BB4 identified by ATCC
deposit number 67717.

2. Two nucleic acid fragments are "homologous"
if they are capable of hybridizing to one another
under hybridization conditions described in Maniatis
et al., op. cit., pp. 320-323. However, using the
following wash conditions: 2 x SSC, 0.1% SDS, room
temperature twice, 30 minutes each; then 2 x SSC, 0.1%
SDS, 50°C once, 30 minutes; then 2 x SSC, room
temperature twice, 10 minutes each, homologous
sequences can be identified that contain at most about
25-30% basepair mismatches. More preferably,
homologous nucleic acid strands contain 15-25%
basepair mismatches, even more preferably 5-15%
basepair mismatches. These degrees of homology can be
selected by using more stringent wash conditions for
identification of clones from gene libraries (or
other sources of genetic material), as is well known
in the art.
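
The mismatch percentages in this definition can be illustrated with a simple count over two aligned strands. This is a rough sketch only: the definition itself is operational (hybridization and washing), and the code merely shows the arithmetic. The two 20-base sequences are hypothetical, the function names are illustrative, and grouping values below 5% with the tightest range is an assumption.

```python
def percent_mismatch(a: str, b: str) -> float:
    """Percent of mismatched positions between two ungapped,
    equal-length, pre-aligned nucleotide sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)


def homology_range(pct: float) -> str:
    """Place a mismatch percentage in the ranges named in the definition.
    Boundary handling is an assumption made for this sketch."""
    if pct <= 15:
        return "5-15% range (even more preferred)"
    if pct <= 25:
        return "15-25% range (more preferred)"
    if pct <= 30:
        return "at most about 25-30% (homologous)"
    return "outside the stated ranges"


# Hypothetical sequences, for illustration only
probe = "ATGGCGTACCTGAAGTTCGA"
target = "ATGGCATACCTGCAGTTCGT"

pct = percent_mismatch(probe, target)
print(pct, homology_range(pct))
```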

3. Two amino acid sequences or two nucleotide
sequences (in an alternative definition for homology
between two nucleotide sequences) are considered
homologous (as this term is preferably used in this
specification) if they have an alignment score of >5
(in standard deviation units) using the program ALIGN
with the mutation gap matrix and a gap penalty of 6 or
greater. See Dayhoff, M.O., in Atlas of Protein
Sequence and Structure (1972) Vol. 5, National
Biomedical Research Foundation, pp. 101-110, and
Supplement 2 to this volume, pp. 1-10. The two
sequences (or parts thereof, preferably at least 30
amino acids in length) are more preferably homologous
if their amino acids are greater than or equal to 50%
identical when optimally aligned using the ALIGN
program mentioned above.

4. A DNA fragment is "derived from" an ET-NANB
viral agent if it has the same or substantially the
same basepair sequence as a region of the viral agent
genome.

5. A protein is "derived from" an ET-NANB viral
agent if it is encoded by an open reading frame of a
DNA or RNA fragment derived from an ET-NANB viral
agent.

II. Obtaining Cloned ET-NANB Fragments

According to one aspect of the invention, it has
been found that a virus-specific DNA clone can be
produced by (a) isolating RNA from the bile of a
cynomolgus monkey having a known ET-NANB infection,
(b) cloning the cDNA fragments to form a fragment
library, and (c) screening the library by
differential hybridization to radiolabeled cDNAs from
infected and non-infected bile sources.

A. cDNA Fragment Mixture
ET-NANB infection in cynomolgus monkeys is
initiated by inoculating the animals intravenously
with a 10% w/v suspension from human case stools
positive for 27-34 nm ET-NANB particles (mean diameter
32 nm). An infected animal is monitored for elevated
levels of alanine aminotransferase, indicating
hepatitis infection. ET-NANB infection is confirmed by
immunospecific binding of seropositive antibodies to
virus-like particles (VLPs), according to published
methods (Gravelle). Briefly, a stool (or bile)
specimen taken from the infected animal 3-4 weeks
after infection is diluted 1:10 with phosphate-buffered
saline, and the 10% suspension is clarified
by low-speed centrifugation and filtration
successively through 1.2 and 0.45 micron filters. The
material may be further purified by pelleting through
a 30% sucrose cushion (Bradley). The resulting
preparation of VLPs is mixed with diluted serum from
human patients with known ET-NANB infection. After
incubation overnight, the mixture is centrifuged
overnight to pellet immune aggregates, and these are
stained and examined by electron microscopy for
antibody binding to the VLPs.

ET-NANB infection can also be confirmed by
seroconversion to VLP-positive serum. Here the serum
of the infected animal is mixed as above with 27-34 nm
VLPs isolated from the stool specimens of infected
human cases and examined by immune electron microscopy
for antibody binding to the VLPs.

Bile can be collected from ET-NANB positive
animals by either cannulating the bile duct and
collecting the bile fluid or by draining the bile
duct during necropsy. Total RNA is extracted from the
bile by hot phenol extraction, as outlined in Example
1A. The RNA fragments are used to synthesize
corresponding duplex cDNA fragments by random priming,
also as referenced in Example 1A. The cDNA fragments
may be fractionated by gel electrophoresis or density
gradient centrifugation to obtain a desired size class
of fragments, e.g., 500-4,000 basepair fragments.

Although alternative sources of viral material,
such as VLPs obtained from stool samples (as
described in Example 4), may be used for producing a
cDNA fraction, the bile source is preferred. According
to one aspect of the invention, it has been found that
bile from ET-NANB-infected monkeys shows a greater
number of intact viral particles than material
obtained from stool samples, as evidenced by immune
electron microscopy. Bile obtained from an ET-NANB
infected human or cynomolgus macaque, for use as a
source of ET-NANB viral protein or genomic material,
or intact virus, forms part of the present invention.

B. cDNA Library and Screening 

The cDNA fragments from above are cloned into a
suitable cloning vector to form a cDNA library. This
may be done by equipping blunt-ended fragments with a
suitable end linker, such as an EcoRI sequence, and
inserting the fragments into a suitable insertion site
of a cloning vector, such as at a unique EcoRI site.
After initial cloning, the library may be re-cloned,
if desired, to increase the percentage of vectors
containing a fragment insert. The library construction
described in Example 1B is illustrative. Here cDNA
fragments were blunt-ended, equipped with EcoRI ends,
and inserted into the EcoRI site of the lambda phage
vector gt10. The library phage, which showed less than
5% fragment inserts, was isolated, and the fragment
inserts re-cloned into the lambda gt10 vector,
yielding more than 95% insert-containing phage.

The cDNA library is screened for sequences
specific for ET-NANB by differential hybridization to
cDNA probes derived from infected and non-infected
sources. cDNA fragments from infected and non-infected
source bile or stool viral isolates can be prepared as
above. Radiolabeling the fragments is by random
labeling, nick translation, or end labeling,
according to conventional methods (Maniatis, p. 109).
The cDNA library from above is screened by transfer to
duplicate nitrocellulose filters, and hybridization
with both infected-source and non-infected-source
(control) radiolabeled probes, as detailed in Example
2. In order to recover sequences that hybridize at the
preferred outer limit of 25-30% basepair mismatches,
clones can be selected if they hybridize under the
conditions described in Maniatis et al., op. cit., pp.
320-323, but using the following wash conditions: 2 x
SSC, 0.1% SDS, room temperature - twice, 30 minutes
each; then 2 x SSC, 0.1% SDS, 50°C - once, 30 minutes;
then 2 x SSC, room temperature - twice, 10 minutes
each. These conditions allowed identification of the
Mexican isolate discussed above using the ET1.1
sequence as a probe. Plaques which show selective
hybridization to the infected-source probes are
preferably re-plated at low plating density and
re-screened as above, to isolate single clones which are
specific for ET-NANB sequences. As indicated in
Example 2, sixteen clones which hybridized
specifically with infected-source probes were
identified by these procedures. One of the clones,
designated lambda gt10-1.1, contained a 1.33 kilobase
fragment insert.
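
The selection logic of the differential screen described above can be sketched in a few lines of Python. This is only an illustration, not the procedure of Example 2: the `Plaque` record, the signal units, and the `min_signal` and `min_ratio` thresholds are all hypothetical and would in practice be judged by eye from duplicate autoradiographs.

```python
from dataclasses import dataclass


@dataclass
class Plaque:
    plaque_id: str
    infected_signal: float  # signal on the filter probed with infected-source cDNA
    control_signal: float   # signal on the duplicate filter probed with control cDNA


def select_candidates(plaques, min_signal=1.0, min_ratio=5.0):
    """Return IDs of plaques that hybridize selectively with the infected-source probe.

    min_signal and min_ratio are illustrative thresholds (assumptions),
    not values taken from the specification.
    """
    selected = []
    for p in plaques:
        if p.infected_signal < min_signal:
            continue  # no meaningful hybridization to the infected-source probe
        # A plaque with no control signal at all is selective by definition.
        if p.control_signal <= 0 or p.infected_signal / p.control_signal >= min_ratio:
            selected.append(p.plaque_id)
    return selected
```

Plaques selected this way would still be re-plated at low density and re-screened, as the text describes, before a clone is treated as ET-NANB specific.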

C. ET-NANB Sequences 

The basepair sequence of cloned regions of the
ET-NANB fragments from Part B are determined by
standard sequencing methods. In one illustrative
method, described in Example 3, the fragment insert
from the selected cloning vector is excised, isolated
by gel electrophoresis, and inserted into a cloning
vector whose basepair sequence on either side of the
insertion site is known. The particular vector
employed in Example 3 is a pTZKF1 vector shown at the
left in Figure 1. The ET-NANB fragment from the gt10-
1.1 phage was inserted at the unique EcoRI site of the
pTZKF1 plasmid. Recombinants carrying the desired
insert were identified by hybridization with the
isolated 1.33 kilobase fragment, as described in
Example 3. One selected plasmid, identified as pTZKF1
(ET1.1), gave the expected 1.33 kb fragment after
vector digestion with EcoRI. E. coli strain BB4
infected with the pTZKF1(ET1.1) plasmid has been
deposited with the American Type Culture Collection,
Rockville, MD, and is identified by ATCC deposit
number 67717.

The pTZKF1(ET1.1) plasmid is illustrated at the
bottom in Figure 1. The fragment insert has 5' and 3'
end regions denoted at A and C, respectively, and an
intermediate region, denoted at B. The sequences in
these regions were determined by standard dideoxy
sequencing and were set forth in an earlier
application in this series. The three short sequences
(A, B, and C) are from the same insert strand. As will
be seen in Example 3, the B-region sequence was
actually determined from the opposite strand, so that
the B region sequence shown above represents the
complement of the sequence in the sequenced strand.
The base numbers of the partial sequences are
approximate.

Later work in the laboratory of the inventors
identified the full sequence, set forth above.
Fragments of this total sequence can readily be
prepared using restriction endonucleases. Computer
analysis of both the forward and reverse sequence has
identified a number of cleavage sites.

III. ET-NANB Fragments 

According to another aspect, the invention
includes ET-NANB-specific fragments or probes which
hybridize with ET-NANB genomic sequences or cDNA
fragments derived therefrom. The fragments may include
full-length cDNA fragments such as described in
Section II, or may be derived from shorter sequence
regions within cloned cDNA fragments. Shorter
fragments can be prepared by enzymatic digestion of
full-length fragments under conditions which yield
desired-sized fragments, as will be described in
Section IV. Alternatively, the fragments can be
produced by oligonucleotide synthetic methods, using
sequences derived from the cDNA fragments. Methods or
commercial services for producing selected-sequence
oligonucleotide fragments are available. Fragments
are usually at least 12 nucleotides in length,
preferably at least 14, 2", 20 or 50 nucleotides, when
used as probes. Probes can be full length or less
than 500, preferably less than 300 or 200, nucleotides
in length.

To confirm that a given ET-NANB fragment is
in fact derived from the ET-NANB viral agent, the
fragment can be shown to hybridize selectively with
cDNA from infected sources. By way of illustration, to
confirm that the 1.33 kb fragment in the pTZKF1(ET1.1)
plasmid is ET-NANB in origin, the fragment was excised
from the pTZKF1(ET1.1) plasmid, purified, and
radiolabeled by random labeling. The radiolabeled
fragment was hybridized with fractionated cDNAs from
infected and non-infected sources to confirm that the
probe reacts only with infected-source cDNAs. This
method is illustrated in Example 4, where the above
radiolabeled 1.33 kb fragment from pTZKF1(ET1.1)
plasmid was examined for binding to cDNAs prepared
from infected and non-infected sources. The infected
sources are (1) bile from a cynomolgus macaque
infected with a strain of virus derived from stool
samples from human patients from Burma with known ET-
NANB infections and (2) a viral agent derived from the
stool sample of a human ET-NANB patient from Mexico.

The cDNAs in each fragment mixture were first
amplified by a linker/primer amplification method
described in Example 4. Fragment separation was on
agarose gel, followed by Southern blotting and then
hybridization to bind the radiolabeled 1.33 kb
fragment to the fractionated cDNAs. The lane
containing cDNAs from the infected sources showed a
smeared band of bound probe, as expected (cDNAs
amplified by the linker/primer amplification method
would be expected to have a broad range of sizes). No
probe binding to the amplified cDNAs from the non-
infected sources was observed. The results indicate
that the 1.33 kb probe is specific for cDNA fragments
associated with ET-NANB infection. This same type of
study, using ET1.1 as the probe, has demonstrated
hybridization to ET-NANB samples collected from
Tashkent, Somalia, Borneo and Pakistan. Secondly, the
fact that the probe is specific for ET-NANB related
sequences derived from different continents (Asia,
Africa and North America) indicates the cloned ET-NANB
Burma sequence (ET1.1) is derived from a common ET-
NANB virus or virus class responsible for ET-NANB
hepatitis infection worldwide.

In a related confirmatory study, probe 
binding to fractionated genomic fragments prepared 
from human or cynomolgus macaque genomic DNA (both 
infected and uninfected) was examined. No probe 
binding was observed to either genomic fraction, 
demonstrating that the ET-NANB fragment is not an 
endogenous human or cynomolgus genomic fragment and 
additionally demonstrating that HEV is an RNA virus. 

Another confirmation of ET-NANB specific
sequences in the fragments is the ability to express
ET-NANB proteins from coding regions in the fragments
and to demonstrate specific sero-reactivity of these
proteins with sera collected during documented
outbreaks of ET-NANB. Section IV below discusses
methods of protein expression using the fragments.

One important use of the ET-NANB-specific
fragments is for identifying ET-NANB-derived cDNAs
which contain additional sequence information. The
newly identified cDNAs, in turn, yield new fragment
probes, allowing further iterations until the entire
viral genome is identified and sequenced. Procedures
for identifying additional ET-NANB library clones and
generating new probes therefrom generally follow the
cloning and selection procedures described in Section
II.

The fragments (and oligonucleotides prepared
based on the sequences given above) are also useful as
primers for a polymerase chain reaction method of
detecting ET-NANB viral genomic material in a patient
sample. This diagnostic method will be described in
Section V below.

Two specific genetic sequences derived from
the Mexican strain, identified herein as 406.3-2 and
406.4-2, have been identified that encode immunogenic
epitopes. This was done by isolating clones which
encode epitopes that immunologically react
specifically with sera from individuals and
experimental animals infected with HEV. Comparison of
the isolated sequences with those in the GenBank
collection of genetic sequences indicates that these
viral sequences are novel. Since these sequences are
unique, they can be used to identify the presence of
HEV and to distinguish this strain of hepatitis from
HAV, HBV, and HCV strains. The sequences are also
useful for the design of oligonucleotide probes to
diagnose the presence of virus in samples. They can
be used for the synthesis of polypeptides that
themselves are used in immunoassays. The specific
406.3-2 and 406.4-2 sequences can be incorporated into
other genetic material, such as vectors, for ease of
expression or replication. They can also be used (as
demonstrated above) for identifying similar antigenic
regions encoded by related viral strains, such as the
Burmese strain.

IV. ET-NANB Proteins 
As indicated above, ET-NANB proteins can be
prepared by expressing open reading-frame coding
regions in ET-NANB fragments. In one preferred
approach, the ET-NANB fragments used for protein
expression are derived from cloned cDNAs which have
been treated to produce desired-size fragments, and
preferably random fragments with sizes predominantly
between about 100 to about 300 base pairs. Example 5
describes the preparation of such fragments by DNAse
digestion. Because it is desired to obtain peptide
antigens of between about 30 to about 100 amino acids,
the digest fragments are preferably size
fractionated, for example by gel electrophoresis, to
select those in the approximately 100-300 basepair
size range. Alternatively, cDNA libraries constructed
directly from HEV-containing sources (e.g., bile or
stool) can be screened directly if cloned into an
appropriate expression vector (see below).
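
The relationship between the preferred fragment size window and the desired peptide length follows from the triplet code: each amino acid is encoded by three bases, so a 100-300 basepair insert can encode at most about 33-100 amino acids. The short Python sketch below only illustrates that arithmetic. The function names are hypothetical, and the calculation assumes the whole insert is read in a single open frame with no stop codon.

```python
def max_peptide_length(fragment_bp: int) -> int:
    """Upper bound on encoded amino acids for an insert of fragment_bp base pairs.

    Assumes the entire insert is translated in one open reading frame
    (three bases per codon, no stop codon); real inserts may encode less.
    """
    if fragment_bp < 0:
        raise ValueError("fragment length cannot be negative")
    return fragment_bp // 3


def in_size_window(fragment_bp: int, low_bp: int = 100, high_bp: int = 300) -> bool:
    """True if the fragment falls in the approximately 100-300 bp range named in the text."""
    return low_bp <= fragment_bp <= high_bp
```

For the limits quoted above, `max_peptide_length(100)` gives 33 and `max_peptide_length(300)` gives 100, consistent with the stated target of about 30 to about 100 amino acids.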

For example, the ET-NANB proteins expressed
by the 406.3-2 and 406.4-2 sequences (and peptide
fragments thereof) are particularly preferred since
these proteins have been demonstrated to be
immunoreactive with a variety of different human sera,
thereby indicating the presence of one or more
epitopes specific for HEV on their surfaces. These
clones were identified by direct screening of a gt11
library.

A. Expression Vector 

The ET-NANB fragments are inserted into a
suitable expression vector. One exemplary expression
vector is lambda gt11, which contains a unique EcoRI
insertion site 53 base pairs upstream of the
translation termination codon of the beta-
galactosidase gene. Thus, the inserted sequence will
be expressed as a beta-galactosidase fusion protein
which contains the N-terminal portion of the beta-
galactosidase gene, the heterologous peptide, and
optionally the C-terminal region of the beta-
galactosidase peptide (the C-terminal portion being
expressed when the heterologous peptide coding
sequence does not contain a translation termination
codon). This vector also produces a temperature-
sensitive repressor (cI857) which causes viral
lysogeny at permissive temperatures, e.g., 32°C, and
leads to viral lysis at elevated temperatures, e.g.,
37°C. Advantages of this vector include: (1) highly
efficient recombinant generation, (2) ability to
select lysogenized host cells on the basis of host-
cell growth at permissive, but not non-permissive,
temperatures, and (3) high levels of recombinant
fusion protein production. Further, since phage
containing a heterologous insert produces an inactive
beta-galactosidase enzyme, phage with inserts can be
readily identified by a beta-galactosidase colored-
substrate reaction.

For insertion into the expression vector,
the viral digest fragments may be modified, if needed,
to contain selected restriction-site linkers, such as
EcoRI linkers, according to conventional procedures.
Example 1 illustrates methods for cloning the digest
fragments into lambda gt11, which includes the steps
of blunt-ending the fragments, ligating with EcoRI
linkers, and introducing the fragments into EcoRI-cut
lambda gt11. The resulting viral genomic library may
be checked to confirm that a relatively large
(representative) library has been produced. This can
be done, in the case of the lambda gt11 vector, by
infecting a suitable bacterial host, plating the
bacteria, and examining the plaques for loss of beta-
galactosidase activity. Using the procedures described
in Example 1, about 50% of the plaques showed loss of
enzyme activity.



B. Peptide Antigen Expression

The viral genomic library formed above is
screened for production of peptide antigen (expressed
as a fusion protein) which is immunoreactive with
antiserum from ET-NANB seropositive individuals. In
a preferred screening method, host cells infected with
phage library vectors are plated, as above, and the
plate is blotted with a nitrocellulose filter to
transfer recombinant protein antigens produced by the
cells onto the filter. The filter is then reacted with
the ET-NANB antiserum, washed to remove unbound
antibody, and reacted with reporter-labeled, anti-
human antibody, which becomes bound to the filter, in
sandwich fashion, through the anti-ET-NANB antibody.

Typically phage plaques which are identified
by virtue of their production of recombinant antigen
of interest are re-examined at a relatively low
density for production of antibody-reactive fusion
protein. Several recombinant phage clones which
produced immunoreactive recombinant antigen were
identified in the procedure.

The selected expression vectors may be used
for scale-up production, for purposes of recombinant
protein purification. Scale-up production is carried
out using one of a variety of reported methods for (a)
lysogenizing a suitable host, such as E. coli, with a
selected lambda gt11 recombinant, (b) culturing the
transduced cells under conditions that yield high
levels of the heterologous peptide, and (c) purifying
the recombinant antigen from the lysed cells.

In one preferred method involving the above
lambda gt11 cloning vector, a high-producer E. coli
host, BNN103, is infected with the selected library
phage and replica plated on two plates. One of the
plates is grown at 32°C, at which viral lysogeny can
occur, and the other at 42°C, at which the infecting
phage is in a lytic stage and therefore prevents cell
growth. Cells which grow at the lower but not the
higher temperature are therefore assumed to be
successfully lysogenized.

The lysogenized host cells are then grown
under liquid culture conditions which favor high
production of the fused protein containing the viral
insert, and lysed by rapid freezing to release the
desired fusion protein.

C. Peptide Purification

The recombinant peptide can be purified by
standard protein purification procedures which may
include differential precipitation, molecular sieve
chromatography, ion-exchange chromatography,
isoelectric focusing, gel electrophoresis and
affinity chromatography. In the case of a fused
protein, such as the beta-galactosidase fused protein
prepared as above, the protein isolation techniques
which are used can be adapted from those used in
isolation of the native protein. Thus, for isolation
of a soluble beta-galactosidase fusion protein, the
protein can be isolated readily by simple affinity
chromatography, by passing the cell lysis material
over a solid support having surface-bound anti-beta-
galactosidase antibody.

D. Viral Proteins 

The ET-NANB protein of the invention may
also be derived directly from the ET-NANB viral agent.
VLPs or protein isolated from stool or liver samples
from an infected individual, as above, are one
suitable source of viral protein material. The VLPs
isolated from the stool sample may be further purified
by affinity chromatography prior to protein isolation
(see below). The viral agent may also be raised in
cell culture, which provides a convenient and
potentially concentrated source of viral protein. Co-
owned U.S. Patent Application Serial No. 846,757,
filed April 1, 19S5, describes an immortalized trioma
liver cell which supports NANB infection in cell
culture. The trioma cell line is prepared by fusing
human liver cells with a mouse/human fusion partner
selected for human chromosome stability. Cells
containing the desired NANB viral agent can be
identified by immunofluorescence methods, employing
anti-ET-NANB human antibodies.

The viral agent is disrupted, prior to
protein isolation, by conventional methods, which can
include sonication, high- or low-salt conditions, or
use of detergents.

Purification of ET-NANB viral protein can be
carried out by affinity chromatography, using a
purified anti-ET-NANB antibody attached according to
standard methods to a suitable solid support. The
antibody itself may be purified by affinity
chromatography, where an immunoreactive recombinant
ET-NANB protein, such as described above, is attached
to a solid support, for isolation of anti-ET-NANB
antibodies from an immune serum source. The bound
antibody is released from the support by standard
methods.

Alternatively, the anti-ET-NANB antibody may
be an antiserum or a monoclonal antibody (Mab)
prepared by immunizing a mouse or other animal with
recombinant ET-NANB protein. For Mab production,
lymphocytes are isolated from the animal and
immortalized with a suitable fusion partner, and
successful fusion products which react with the
recombinant protein immunogen are selected. These in
turn may be used in affinity purification procedures,
described above, to obtain native ET-NANB antigen.



V. Utility 

Although ET-NANB is primarily of interest
because of its effects on humans, recent data have
shown that this virus is also capable of infecting
other animals, especially mammals. Accordingly, any
discussion herein of utility applies to both human and
veterinary uses, especially commercial veterinary
uses, such as the diagnosis and treatment of pigs,
cattle, sheep, horses, and other domesticated animals.

A. Diagnostic Methods

The particles and antigens of the invention,
as well as the genetic material, can be used in
diagnostic assays. Methods for detecting the presence
of ET-NANB hepatitis comprise analyzing a biological
sample such as a blood sample, stool sample or liver
biopsy specimen for the presence of an analyte
associated with ET-NANB hepatitis virus.

The analyte can be a nucleotide sequence
which hybridizes with a probe comprising a sequence of
at least about 16 consecutive nucleotides, usually 30
to 200 nucleotides, up to substantially the full
sequence of the sequences shown above (cDNA
sequences). The analyte can be RNA or cDNA. The
analyte is typically a virus particle suspected of
being ET-NANB or a particle for which this
classification is being ruled out. The virus particle
can be further characterized as having an RNA viral
genome comprising a sequence at least about 70%
homologous to a sequence of at least 12 consecutive
nucleotides of the "forward" and "reverse" sequences
given above, usually at least about 80% homologous to
at least about 60 consecutive nucleotides within the
sequences, and may comprise a sequence substantially
homologous to the full-length sequences. In order to
detect an analyte, where the analyte hybridizes to a
probe, the probe may contain a detectable label.
Particularly preferred for use as a probe are
sequences of consecutive nucleotides derived from the
406.3-2 and 406.4-2 clones described herein, since
these clones appear to be particularly diagnostic for
HEV.
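
The homology criteria above (for example, at least about 70% over at least 12 consecutive nucleotides) can be illustrated with a simple window comparison. The Python sketch below is one possible reading, not a definition taken from the specification: it treats "homologous" as plain ungapped base identity, the function name `best_window_identity` is hypothetical, and both sequences are assumed to be written in the same alphabet (an RNA sequence would first need U replaced by T to compare against cDNA).

```python
def best_window_identity(query: str, reference: str, window: int = 12) -> float:
    """Best fraction of identical bases between any `window`-long stretch of
    `query` and any `window`-long stretch of `reference` (ungapped comparison).

    Returns 0.0 if either sequence is shorter than the window.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    query, reference = query.upper(), reference.upper()
    if len(query) < window or len(reference) < window:
        return 0.0
    best = 0.0
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(reference) - window + 1):
            r = reference[j:j + window]
            matches = sum(a == b for a, b in zip(q, r))
            best = max(best, matches / window)
    return best
```

Under this reading, a candidate genome would meet the first criterion if `best_window_identity(genome, reference, 12) >= 0.70`. The exhaustive comparison is quadratic in sequence length and is meant only to make the criterion concrete; practical sequence comparison would use an alignment tool.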

The analyte can also comprise an antibody
which recognizes an antigen, such as a cell surface
antigen, on an ET-NANB virus particle. The analyte can
also be an ET-NANB viral antigen. Where the analyte is
an antibody or an antigen, either a labelled antigen
or antibody, respectively, can be used to bind to the
analyte to form an immunological complex, which can
then be detected by means of the label.

Typically, methods for detecting analytes
such as surface antigens and/or whole particles are
based on immunoassays. Immunoassays can be conducted
either to determine the presence of antibodies in the
host that have arisen from infection by ET-NANB
hepatitis virus or by assays that directly determine
the presence of virus particles or antigens. Such
techniques are well known and need not be described
here in detail. Examples include both heterogeneous
and homogeneous immunoassay techniques. Both
techniques are based on the formation of an
immunological complex between the virus particle or
its antigen and a corresponding specific antibody.
Heterogeneous assays for viral antigens typically use
a specific monoclonal or polyclonal antibody bound to
a solid surface. Sandwich assays are becoming
increasingly popular. Homogeneous assays, which are
carried out in solution without the presence of a
solid phase, can also be used, for example by
determining the difference in enzyme activity brought
on by binding of free antibody to an enzyme-antigen
conjugate. A number of suitable assays are disclosed
in U.S. Patent Nos. 3,817,337, 4,005,360, 3,996,345.

When assaying for the presence of antibodies
induced by ET-NANB viruses, the viruses and antigens
of the invention can be used as specific binding
agents to detect either IgG or IgM antibodies. Since
IgM antibodies are typically the first antibodies that
appear during the course of an infection, when IgG
synthesis may not yet have been initiated,
specifically distinguishing between IgM and IgG
antibodies present in the blood stream of a host will
enable a physician or other investigator to determine
whether the infection is recent or convalescent.
Proteins expressed by the 406.3-2 and 406.4-2 clones
described herein and peptide fragments thereof are
particularly preferred for use as specific binding
agents to detect antibodies since they have been
demonstrated to be reactive with a number of different
human HEV sera. Further, they are reactive with both
acute and convalescent sera.

In one diagnostic configuration, test serum
is reacted with a solid phase reagent having surface-
bound ET-NANB protein antigen. After binding anti-ET-
NANB antibody to the reagent and removing unbound
serum components by washing, the reagent is reacted
with reporter-labeled anti-human antibody to bind
reporter to the reagent in proportion to the amount of
bound anti-ET-NANB antibody on the solid support. The
reagent is again washed to remove unbound labeled
antibody, and the amount of reporter associated with
the reagent is determined. Typically, the reporter is
an enzyme which is detected by incubating the solid
phase in the presence of a suitable fluorometric or
colorimetric substrate.

The solid surface reagent in the above assay
is prepared by known techniques for attaching protein
material to solid support material, such as polymeric
beads, dip sticks, or filter material. These
attachment methods generally include non-specific
adsorption of the protein to the support or covalent
attachment of the protein, typically through a free
amine group, to a chemically reactive group on the
solid support, such as an activated carboxyl, hydroxyl,
or aldehyde group.

In a second diagnostic configuration, known
as a homogeneous assay, antibody binding to a solid
support produces some change in the reaction medium
which can be directly detected in the medium. Known
general types of homogeneous assays proposed
heretofore include (a) spin-labeled reporters, where
antibody binding to the antigen is detected by a
change in reported mobility (broadening of the spin
splitting peaks), (b) fluorescent reporters, where
binding is detected by a change in fluorescence
efficiency, (c) enzyme reporters, where antibody
binding affects enzyme/substrate interactions, and (d)
liposome-bound reporters, where binding leads to
liposome lysis and release of encapsulated reporter.
The adaptation of these methods to the protein antigen
of the present invention follows conventional methods
for preparing homogeneous assay reagents.

In each of the assays described above, the
assay method involves reacting the serum from a test
individual with the protein antigen and examining the
antigen for the presence of bound antibody. The
examining may involve attaching a labeled anti-human
antibody to the antibody being examined, either IgM
(acute phase) or IgG (convalescent phase), and
measuring the amount of reporter bound to the solid
support, as in the first method, or may involve
observing the effect of antibody binding on a
homogeneous assay reagent, as in the second method.

Also forming part of the invention is an
assay system or kit for carrying out the assay method
just described. The kit generally includes a support
with surface-bound recombinant protein antigen which
is (a) immunoreactive with antibodies present in
individuals infected with enterically transmitted
nonA/nonB viral agent and (b) derived from a viral
hepatitis agent whose genome contains a region which
is homologous to the 1.33 kb DNA EcoRI insert present
in plasmid pTZKF1(ET1.1) carried in E. coli strain
BB4, and having ATCC deposit no. 67717. A reporter-
labeled anti-human antibody in the kit is used for
detecting surface-bound anti-ET-NANB antibody.
B. Viral Genome Diagnostic Applications

The genetic material of the invention can
itself be used in numerous assays as probes for
genetic material present in naturally occurring
infections. One method for amplification of target
nucleic acids, for later analysis by hybridization
assays, is known as the polymerase chain reaction or
PCR technique. The PCR technique can be applied to
detecting virus particles of the invention in
suspected pathological samples using oligonucleotide
primers spaced apart from each other and based on the
genetic sequence set forth above. The primers are
complementary to opposite strands of a double stranded
DNA molecule and are typically separated by from about
50 to 450 nt or more (usually not more than 2000 nt).
This method entails preparing the specific
oligonucleotide primers and then repeated cycles of
target DNA denaturation, primer binding, and
extension with a DNA polymerase to obtain DNA
fragments of the expected length based on the primer
spacing. Extension products generated from one primer
serve as additional target sequences for the other
primer. The degree of amplification of a target
sequence is controlled by the number of cycles that
are performed and is theoretically calculated by the
simple formula 2^n, where n is the number of cycles.
Given that the average efficiency per cycle ranges
from about 65% to 85%, 25 cycles produce from 0.3 to
4.8 million copies of the target sequence. The PCR
method is described in a number of publications,
including Saiki et al., Science (1985) 230:1350-1354;
Saiki et al., Nature (1986) 324:163-166; and Scharf et
al., Science (1986) 233:1076-1078. Also see U.S.
Patent Nos. 4,683,194; 4,683,195; and 4,683,202.
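
The amplification figures quoted above can be reproduced with a short calculation. With perfect doubling, n cycles give 2^n copies per starting template; if each cycle succeeds with average efficiency E, the yield is (1 + E)^n. The Python sketch below is only an illustration of that arithmetic: the function name and the single-template starting point are assumptions, and the constant-efficiency model ignores plateau effects in late cycles.

```python
def pcr_copies(cycles: int, efficiency: float = 1.0, start_copies: float = 1.0) -> float:
    """Expected copy number after `cycles` rounds of PCR.

    efficiency is the average fraction of templates copied per cycle
    (1.0 reproduces the ideal 2**n doubling). Assumes constant efficiency.
    """
    if cycles < 0:
        raise ValueError("cycles cannot be negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return start_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for eff in (0.65, 0.85, 1.0):
        print(f"efficiency {eff:.0%}: {pcr_copies(25, eff):,.0f} copies after 25 cycles")
```

For 25 cycles, efficiencies of 65% and 85% give roughly 0.27 million and 4.8 million copies per starting template, matching the range stated in the text, against about 33.6 million for ideal doubling.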

The invention includes a specific diagnostic
method for determination of ET-NANB viral agent, based
on selective amplification of ET-NANB fragments. This
method employs a pair of single-strand primers derived
from non-homologous regions of opposite strands of a
DNA duplex fragment, which in turn is derived from an
enterically transmitted viral hepatitis agent whose
genome contains a region which is homologous to the
1.33 kb DNA EcoRI insert present in plasmid
pTZKF1(ET1.1) carried in E. coli strain BB4, and
having ATCC deposit no. 67717. These "primer
fragments," which form one aspect of the invention,
are prepared from ET-NANB fragments such as described
in Section III above. The method follows the process
for amplifying selected nucleic acid sequences as
disclosed in U.S. Patent No. 4,683,202, as discussed
above.

C. Peptide Vaccine

Any of the antigens of the invention can be 
used in preparation of a vaccine. A preferred starting 
material for preparation of 3 vaccine is the particle 
antigen isolated from bile. The antigens are 

20 preferably initially recovered as intact particles as 
described above. However, it is also possible to pre- 
pare a suitable vaccine from particles isolated from 
other sources or non-particle recombinant antigens. 
When non-particle antigens are used (typically soluble 

25 antigens), proteins derived from the viral envelope or 
viral capsid are preferred for use in preparing vac- 
cines. These proteins can be purified by affinity 
chromatography, also described above. 

If the purified protein is not immunogenic 

30 per se, it can be bound to a carrier to make the 

protein immunogenic. Carriers include bovine serum 
albumin, keyhole limpet hemocyanin and the like. It is 
desirable, but not necessary, to purify antigens to be 
substantially free of human protein. However, it is 

35 more important that the antigens be free of proteins, 
viruses, and other subst^.nres not of human origin 
that may have been intrc;:ucec by way of, or 
contamination of, the nutrient medium, cell lines, 
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s, or pathological fluids from which the virus 




tis 



is cultured or obtained. 

Vaccination can be conducted in conventional 
fashion. For example, the antigen, whether a viral 
particle or a protein, can be used in a suitable 
diluent such as water, saline, buffered salines, 
complete or incomplete adjuvants, and the like. The 
immunogen is administered using standard techniques 
for antibody induction, such as by subcutaneous 
administration of physiologically compatible, sterile 
solutions containing inactivated or attenuated virus 
particles or antigens. An immune-response-producing
amount of virus particles is typically administered
per vaccinating injection, typically in a volume of
one milliliter or less.



A specific example of a vaccine composition
includes, in a pharmacologically acceptable adjuvant,
a recombinant protein or protein mixture derived from
an enterically transmitted nonA/nonB viral hepatitis
agent whose genome contains a region which is
homologous to the 1.33 kb DNA EcoRI insert present in
plasmid pTZKF1(ET1.1) carried in E. coli strain BB4,
and having ATCC deposit no. 67717. The vaccine is
administered at periodic intervals until a significant
titer of anti-ET-NANB antibody is detected in the
serum. The vaccine is intended to protect against ET-
NANB infection.

Particularly preferred are vaccines prepared
using proteins expressed by the 406.3-2 and 406.4-2
clones described herein and equivalents thereof,
including fragments of the expressed proteins. Since
these clones have already been demonstrated to be
reactive with a variety of human HEV-positive sera,
their utility in protecting against a variety of HEV
strains is indicated.

D. Prophylactic and Therapeutic
Antibodies and Antisera

The compositions can be used to prepare antibodies to ET-
NANB virus particles. The antibodies can be used
directly as antiviral agents. To prepare antibodies, a
host animal is immunized using the virus particles or,
as appropriate, non-particle antigens native to the
virus particle are bound to a carrier as described
above for vaccines. The host serum or plasma is
collected following an appropriate time interval to
provide a composition comprising antibodies reactive
with the virus particle. The gamma globulin fraction
or the IgG antibodies can be obtained, for example, by
use of saturated ammonium sulfate or DEAE Sephadex, or
other techniques known to those skilled in the art.
The antibodies are substantially free of many of the
adverse side effects which may be associated with
other anti-viral agents such as drugs.

The antibody compositions can be made even
more compatible with the host system by minimizing
potential adverse immune system responses. This is
accomplished by removing all or a portion of the Fc
portion of a foreign species antibody or using an
antibody of the same species as the host animal, for
example, the use of antibodies from human/human
hybridomas.

The antibodies can also be used as a means
of enhancing the immune response since antibody-virus
complexes are recognized by macrophages. The
antibodies can be administered in amounts similar to those
used for other therapeutic administrations of
antibody. For example, pooled gamma globulin is
administered at 0.02-0.1 ml/lb body weight during the early
incubation of other viral diseases such as rabies,
measles and hepatitis B to interfere with viral entry
into cells. Thus, antibodies reactive with the ET-NANB
virus particle can be passively administered alone or
in conjunction with another anti-viral agent to a host
infected with an ET-NANB virus to enhance the immune
response and/or the effectiveness of an antiviral
drug.

Alternatively, anti-ET-NANB-virus antibodies
can be induced by administering anti-idiotype
antibodies as immunogens. Conveniently, a purified
anti-ET-NANB-virus antibody preparation prepared as
described above is used to induce anti-idiotype antibody
in a host animal. The composition is administered to
the host animal in a suitable diluent. Following
administration, usually repeated administration, the
host produces anti-idiotype antibody. To eliminate an
immunogenic response to the Fc region, antibodies
produced by the same species as the host animal can be
used or the Fc region of the administered antibodies
can be removed. Following induction of anti-idiotype
antibody in the host animal, serum or plasma is
removed to provide an antibody composition. The
composition can be purified as described above for
anti-ET-NANB virus antibodies, or by affinity
chromatography using anti-ET-NANB-virus antibodies
bound to the affinity matrix. The anti-idiotype
antibodies produced are similar in conformation to the
authentic ET-NANB antigen and may be used to prepare
an ET-NANB vaccine rather than using an ET-NANB
particle antigen.

When used as a means of inducing anti-ET-
NANB virus antibodies in a patient, the manner of
injecting the antibody is the same as for vaccination
purposes, namely intramuscularly, intraperitoneally,
subcutaneously or the like in an effective
concentration in a physiologically suitable diluent
with or without adjuvant. One or more booster
injections may be desirable. The anti-idiotype method
of induction of anti-ET-NANB virus antibodies can
alleviate problems which may be caused by passive
administration of anti-ET-NANB-virus antibodies, such
as an adverse immune response, and those associated
with administration of purified blood components, such
as infection with as yet undiscovered viruses.

The ET-NANB derived proteins of the invention are
also intended for use in producing antiserum designed
for pre- or post-exposure prophylaxis. Here an ET-NANB
protein, or mixture of proteins, is formulated with a
suitable adjuvant and administered by injection to
human volunteers, according to known methods for
producing human antisera. Antibody response to the
injected proteins is monitored, during a several-week
period following immunization, by periodic serum
sampling to detect the presence of anti-ET-NANB serum
antibodies, as described in Section IIA above.

The antiserum from immunized individuals may
be administered as a pre-exposure prophylactic measure
for individuals who are at risk of contracting
infection. The antiserum is also useful in treating an
individual post-exposure, analogous to the use of high
titer antiserum against hepatitis B virus for post-
exposure prophylaxis.

E. Monoclonal Antibodies

For both in vivo use of antibodies to ET-
NANB virus particles and proteins and anti-idiotype
antibodies and diagnostic use, it may be preferable to
use monoclonal antibodies. Monoclonal anti-virus
particle antibodies or anti-idiotype antibodies can be
produced as follows. The spleen or lymphocytes from an
immunized animal are removed and immortalized or used
to prepare hybridomas by methods known to those
skilled in the art. To produce a human-human
hybridoma, a human lymphocyte donor is selected. A
donor known to be infected with an ET-NANB virus (where
infection has been shown, for example, by the presence
of anti-virus antibodies in the blood or by virus
culture) may serve as a suitable lymphocyte donor.
Lymphocytes can be isolated from a peripheral blood
sample, or spleen cells may be used if the donor is
subject to splenectomy. Epstein-Barr virus (EBV) can
be used to immortalize human lymphocytes or a human
fusion partner can be used to produce human-human
hybridomas. Primary in vitro immunization with
peptides can also be used in the generation of human
monoclonal antibodies.

Antibodies secreted by the immortalized
cells are screened to determine the clones that
secrete antibodies of the desired specificity. For
monoclonal anti-virus particle antibodies, the
antibodies must bind to ET-NANB virus particles. For
monoclonal anti-idiotype antibodies, the antibodies
must bind to anti-virus particle antibodies. Cells
producing antibodies of the desired specificity are
selected.

The following examples illustrate various 
aspects of the invention, but are in no way intended 
to limit the scope thereof. 

Materials

The materials used in the following Examples
were as follows:

Enzymes: DNAse I and alkaline phosphatase
were obtained from Boehringer Mannheim Biochemicals
(BMB, Indianapolis, IN); EcoRI, EcoRI methylase, DNA
ligase, and DNA Polymerase I, from New England Biolabs
(NEB, Beverly, MA); and RNase A was obtained from Sigma
(St. Louis, MO).

Other reagents: EcoRI linkers were obtained
from NEB; and nitro blue tetrazolium (NBT), 5-bromo-4-
chloro-3-indolyl phosphate (BCIP), 5-bromo-4-chloro-3-
indolyl-β-D-galactopyranoside (Xgal) and isopropyl β-
D-thiogalactopyranoside (IPTG) were obtained from
Sigma.

cDNA synthesis kit and random priming
labeling kits are available from Boehringer-Mannheim
Biochemicals (BMB, Indianapolis, IN).

Example 1
Preparing cDNA Library
A. Source of ET-NANB virus 

Two cynomolgus monkeys (cynos) were 
intravenously injected with a 10% suspension of a 
stool pool obtained from a second-passage cyno (cyno 
#37) infected with a strain of ET-NANB virus isolated 
from Burma cases whose stools were positive for ET- 
NANB, as evidenced by binding of 27-34 nm virus-like 
particles (VLPs) in the stool to immune serum from a 
known ET-NANB patient. The animals developed elevated
levels of alanine aminotransferase (ALT) between 24 and 36
days after inoculation, and one excreted 27-34 nm
VLPs in its bile in the pre-acute phase of infection.

The bile duct of each infected animal was 
cannulated and about 1-3 cc of bile was collected 
daily. RNA was extracted from one bile specimen (cyno 
#121) by hot phenol extraction, using a standard RNA 
isolation procedure. Double-strand cDNA was formed 
from the isolated RNA by a random primer for first- 
strand generation, using a cDNA synthesis kit obtained 
from Boehringer-Mannheim (Indianapolis, IN). 

B. Cloning the Duplex Fragments 

The duplex cDNA fragments were blunt-ended 
with T4 DNA polymerase under standard conditions 
(Maniatis, p. 118), then extracted with 
phenol/chloroform and precipitated with ethanol. The 
blunt-ended material was ligated with EcoRI linkers 
under standard conditions (Maniatis, pp. 396-397) and 
digested with EcoRI to remove redundant linker ends. 
Non-ligated linkers were removed by sequential 
isopropanol precipitation. 

Lambda gt10 phage vector (Huynh) was
obtained from Promega Biotec (Madison, WI). This
cloning vector has a unique EcoRI cloning site in the
phage CI repressor gene. The cDNA fragments from above
were introduced into the EcoRI site by mixing 0.5-
1.0 µg EcoRI-cleaved gt10, 0.5-3 µl of the above
duplex fragments, 0.5 µl 10X ligation buffer, 0.5 µl
ligase (200 units), and distilled water to 5 µl. The
mixture was incubated overnight at 14°C, followed by
in vitro packaging, according to standard methods
(Maniatis, pp. 256-268).
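
The ligation recipe fixes the total volume at 5 µl and leaves the water volume to be worked out from the other components. The following is a minimal sketch of that bookkeeping. The helper name `water_to_volume` is illustrative, and the vector volume used in the example is hypothetical, since the text gives the vector as a mass (0.5-1.0 µg) and not as a volume.

```python
def water_to_volume(total_ul: float, *component_ul: float) -> float:
    """Volume of water (µl) needed to bring the listed components to total_ul."""
    used = sum(component_ul)
    if used > total_ul:
        raise ValueError(f"components ({used} µl) exceed the total ({total_ul} µl)")
    return total_ul - used


if __name__ == "__main__":
    vector_ul = 1.0      # hypothetical: 1.0 µg vector from an assumed 1 µg/µl stock
    fragments_ul = 2.0   # within the stated 0.5-3 µl range
    buffer_ul = 0.5      # 10X ligation buffer
    ligase_ul = 0.5      # 200 units of ligase
    water = water_to_volume(5.0, vector_ul, fragments_ul, buffer_ul, ligase_ul)
    print(f"add {water} µl distilled water")
```

Note that 0.5 µl of 10X buffer in a 5 µl reaction gives the expected 1X final buffer concentration.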

The packaged phage were used to infect an E.
coli hfl strain, such as strain HG415. Alternatively,
E. coli strain C600 hfl, available from Promega
Biotec, Madison, WI, could be used. The percentage of
recombinant plaques obtained with insertion of the
EcoRI-ended fragments was less than 5% by analysis of
20 random plaques.

The resultant cDNA library was plated and
phage were eluted from the selection plates by
addition of elution buffer. After DNA extraction from
the phage, the DNA was digested with EcoRI to release
the heterogeneous insert population, and the DNA
fragments were fractionated on agarose to remove phage
fragments. The 500-4,000 basepair inserts were
isolated and recloned into lambda gt10 as above, and
the packaged phage was used to infect E. coli strain
HG415. The percentage of successful recombinants was
greater than 95%. The phage library was plated on E.
coli strain HG415, at about 5,000 plaques/plate, on a
total of 8 plates.



Example 2
Selecting ET-NANB Cloned Fragments

A. cDNA Probes

Duplex cDNA fragments from noninfected and
ET-NANB-infected cynomolgus monkeys were prepared as in
Example 1. The cDNA fragments were radiolabeled by
random priming, using a random-priming labeling kit
obtained from Boehringer-Mannheim (Indianapolis, IN).



B. Clone Selection

The plated cDNA library from Example 1 was
transferred to each of two nitrocellulose filters, and
the phage DNA was fixed on the filters by baking,
according to standard methods (Maniatis, pp. 320-323).
The duplicate filters were hybridized with either
infected-source or control cDNA probes from above.
Autoradiographs of the filters were examined to
identify library clones which hybridized with
radiolabeled cDNA probes from infected source only,
i.e., did not hybridize with cDNA probes from the non-
infected source. Sixteen such clones, out of a total
of about 40,000 clones examined, were identified by
this subtraction selection method.
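
The subtraction selection amounts to a set difference: clones that light up with the infected-source probe, minus those that also light up with the control probe. The following is a minimal sketch of that logic together with the library arithmetic implied by Example 1. The clone identifiers are hypothetical placeholders, and `subtractive_selection` is an illustrative name, not a procedure defined in the text.

```python
def subtractive_selection(infected_hits, control_hits):
    """Clones hybridizing with the infected-source probe but not the control probe."""
    return set(infected_hits) - set(control_hits)


# Library size implied by Example 1: about 5,000 plaques/plate on 8 plates.
PLAQUES_PER_PLATE = 5_000
PLATES = 8
library_size = PLAQUES_PER_PLATE * PLATES

# Sixteen clones were selected out of the clones examined.
SELECTED_CLONES = 16
hit_fraction = SELECTED_CLONES / library_size

if __name__ == "__main__":
    # Hypothetical clone identifiers, for illustration only.
    infected = {"c01", "c02", "c03", "c04"}
    control = {"c02", "c04"}
    print(sorted(subtractive_selection(infected, control)))
    print(f"{library_size} clones screened, hit fraction {hit_fraction:.2%}")
```

The product of plates and plaques per plate matches the "about 40,000 clones examined" figure, and sixteen selected clones correspond to about 0.04% of the library.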

Each of the sixteen clones was picked and
replated at low concentration on an agar plate. The
clones on each plate were transferred to two
nitrocellulose filters as duplicate lifts, and examined for
hybridization to radiolabeled cDNA probes from infected and
noninfected sources, as above. Clones were selected
which showed selective binding for infected-source
probes (i.e., binding with infected-source probes and
substantially no binding with non-infected-source
probes). One of the clones which bound selectively to
probe from infected source was isolated for further
study. The selected vector was identified as lambda
gt10-1.1, indicated in Figure 1.

Example 3
ET-NANB Sequence

Clone lambda gt10-1.1 from Example 2 was digested
with EcoRI to release the heterologous insert, which
was separated from the vector fragments by gel
electrophoresis. The electrophoretic mobility of the
fragment was consistent with a 1.33 kb fragment. This
fragment, which contained EcoRI ends, was inserted
into the EcoRI site of a pTZKF1 vector, whose
construction and properties are described in co-owned
U.S. patent application for "Cloning Vector System and
Method for Rare Clone Identification", Serial No.
125,650, filed November 25, 1987. Briefly, and as
illustrated in Figure 1, this plasmid contains a
unique EcoRI site adjacent a T7 polymerase promoter
site, and plasmid and phage origins of replication.
The sequence immediately adjacent each side of the
EcoRI site is known. E. coli BB4 bacteria, obtained
from Stratagene (La Jolla, CA), were transformed with
the plasmid.

Radiolabeled ET-NANB probe was prepared by
excising the 1.33 kb insert from the lambda gt10-1.1
phage in Example 2, separating the fragment by gel
electrophoresis, and randomly labeling as above.
Bacteria transfected with the above pTZKF1 and
containing the desired ET-NANB insert were selected by
replica lift and hybridization with the radiolabeled
ET-NANB probe, according to methods outlined in
Example 2.

One bacterial colony containing a
successful recombinant was used for sequencing a
portion of the 1.33 kb insert. This isolate,
designated pTZKF1(ET1.1), has been deposited with the
American Type Culture Collection, and is identified by
ATCC deposit no. 67717. Using a standard dideoxy
sequencing procedure, and primers for the sequences
flanking the EcoRI site, about 200-250 basepairs of
sequence from the 5'-end region and 3'-end region of
the insert were obtained. The sequences are given
above in Section II. Later sequencing by the same
techniques gave the full sequence in both directions,
also given above.

Example 4
Detecting ET-NANB Sequences

cDNA fragment mixtures from the bile of
noninfected and ET-NANB-infected cynomolgus monkeys
were prepared as above. The cDNA fragments obtained
from human stool samples were prepared as follows.
Thirty ml of a [illegible] stool suspension obtained from an
individual from Mexico diagnosed as infected with ET-
NANB as a result of an outbreak, and a similar
volume of stool from a healthy, non-infected
individual, were layered over a [illegible] sucrose density-
gradient cushion, and centrifuged at 25,000 x g for 6
hr in an SW27 rotor, at [illegible]. The pelleted material
from the infected-source stool contained 27-34 nm VLP
particles characteristic of ET-NANB infection in the
infected-stool sample. RNA was isolated from the
sucrose-gradient pellets in both the infected and non-
infected samples, and the isolated RNA was used to
produce cDNA fragments as described in Example 1.

The cDNA fragment mixtures from infected and
non-infected bile source, and from infected and non-
infected human-stool source were each amplified by a
novel linker/primer replication method described in
co-owned patent application serial number 07/208,512
for "DNA Amplification and Subtraction Technique,"
filed June 17, 1988. Briefly, the fragments in each
sample were blunt-ended with DNA Pol I then extracted
with phenol/chloroform and precipitated with ethanol.
The blunt-ended material was ligated with linkers
having the following sequence (top or 5' sequence has
SEQ ID NO:21; bottom or 3' sequence has SEQ ID NO:22):

5'-GGAATTCGCGGCCGCTCG-3'
3'-TTCCTTAAGCGCCGGCGAGC-5'

The duplex fragments were digested with
NruI to remove linker dimers, mixed with a primer

having the sequence 5'-GGAATTCGCGGCCGCTCG-3', and then

heat denatured and cooled to room temperature to form
single-strand DNA/primer complexes. The complexes were
replicated to form duplex fragments by addition of
Thermus aquaticus (Taq) polymerase and all four
deoxynucleotides. The replication procedure,
involving successive strand denaturation, formation of
strand/primer complexes, and replication, was repeated
25 times.
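
The linker design can be checked directly from the two strands printed above. The following is a minimal sketch, assuming the standard recognition sequences for EcoRI (GAATTC), NotI (GCGGCCGC) and NruI (TCGCGA); the helper names are illustrative. It confirms that the strands pair with one blunt end, and shows that an NruI site arises only when two linkers are joined blunt end to blunt end, which is consistent with the use of NruI to remove linker dimers.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


# Linker strands as printed in the text.
TOP_5_TO_3 = "GGAATTCGCGGCCGCTCG"
BOTTOM_3_TO_5 = "TTCCTTAAGCGCCGGCGAGC"
BOTTOM_5_TO_3 = BOTTOM_3_TO_5[::-1]

# Standard recognition sequences, written 5'->3'.
SITES = {"EcoRI": "GAATTC", "NotI": "GCGGCCGC", "NruI": "TCGCGA"}


def find_sites(seq: str) -> dict:
    """Map each enzyme to the first 0-based position of its site, or -1 if absent."""
    return {name: seq.find(site) for name, site in SITES.items()}


# The bottom strand pairs with the entire top strand and carries extra bases
# at its 3' end, so only one end of the duplex is blunt.
pairs_fully = BOTTOM_5_TO_3.startswith(reverse_complement(TOP_5_TO_3))
overhang = BOTTOM_5_TO_3[len(TOP_5_TO_3):]

# Two linkers joined blunt end to blunt end, read along one continuous strand.
linker_dimer = TOP_5_TO_3 + BOTTOM_5_TO_3

if __name__ == "__main__":
    print("strands pair:", pairs_fully, "| overhang:", overhang)
    print("single linker:", find_sites(TOP_5_TO_3))
    print("linker dimer: ", find_sites(linker_dimer))
```

A single linker carries EcoRI and NotI sites but no NruI site; the NruI site spans the junction of the dimer (…CTCG|CGAG…).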

The amplified cDNA sequences were 
fractionated by agarose gel electrophoresis, using a 
2% agarose matrix. After transfer of the DNA fragments 
from the agarose gels to nitrocellulose paper, the 
filters were hybridized to a random-labeled 32P probe
prepared by (i) treating the pTZKF1(ET1.1) plasmid
from above with EcoRI, (ii) isolating the released
1.33 kb ET-NANB fragment, and (iii) randomly labeling
the isolated fragment. The probe hybridization was
performed by conventional Southern blotting methods 
(Maniatis, pp. 382-389). Figure 2 shows the 
hybridization pattern obtained with cDNAs from 
infected (I) and non-infected (N) bile sources (2A) 
and from infected (I) and noninfected (N) human stool 
sources (2B). As seen, the ET-NANB probe hybridized 
with fragments obtained from both of the infected 
sources, but was non-homologous to sequences obtained 
from either of the non-infected sources, thus 
confirming the specificity of the derived sequence.

Southern blots of the radiolabeled 1.33 kb 
fragment with genomic DNA fragments from both human 
and cynomolgus-monkey DNA were also prepared. No
probe hybridization to either of the genomic fragment
mixtures was observed, confirming that the ET-NANB
sequence is exogenous to either human or cynomolgus
genome.

Example 5 
Expressing ET-NANB Proteins 
A. Preparing ET-NANB Coding Sequences 

The pTZKF1(ET1.1) plasmid from Example 2
was digested with EcoRI to release the 1.33 kb ET-NANB
insert which was purified from the linearized plasmid
by gel electrophoresis. The purified fragment was
suspended in a standard digest buffer (0.5 M Tris HCl,
pH 7.5; 1 mg/ml BSA; 10 mM MnCl2) to a concentration of
about 1 mg/ml and digested with DNAse I at room
temperature for about 5 minutes. These reaction
conditions were determined from a prior calibration
study, in which the incubation time required to
produce predominantly 100-300 basepair fragments was
determined. The material was extracted with
phenol/chloroform before ethanol precipitation.

The fragments in the digest mixture were
blunt-ended and ligated with EcoRI linkers as in
Example 1. The resultant fragments were analyzed by
electrophoresis (5-10 V/cm) on 1.2% agarose gel, using
PhiX174/HaeIII and lambda/HindIII size markers. The
100-300 bp fraction was eluted onto NA45 strips
(Schleicher and Schuell), which were then placed into
1.5 ml microtubes with eluting solution (1 M NaCl, 50
mM arginine, pH 9.0), and incubated at 67°C for 30-60
minutes. The eluted DNA was phenol/chloroform
extracted and then precipitated with two volumes of
ethanol. The pellet was resuspended in 20 µl TE (0.01
M Tris HCl, pH 7.5, 0.001 M EDTA).

B. Cloning in an Expression Vector 

Lambda gt11 phage vector (Huynh) was
obtained from Promega Biotec (Madison, WI). This
cloning vector has a unique EcoRI cloning site 53 base
pairs upstream from the beta-galactosidase translation
termination codon. The genomic fragments from above,
provided either directly from coding sequences
(Example 5) or after amplification of cDNA (Example
4), were introduced into the EcoRI site by mixing 0.5-
1.0 µg EcoRI-cleaved gt11, 0.3-3 µl of the above sized
fragments, 0.5 µl 10X ligation buffer (above), 0.5 µl
ligase (200 units), and distilled water to 5 µl. The
mixture was incubated overnight at 14°C, followed by
in vitro packaging, according to standard methods
(Maniatis, pp. 256-268).

The packaged phage were used to infect E.
coli strain KM392, obtained from Dr. Kevin Moore, DNAX
(Palo Alto, CA). Alternatively, E. coli strain Y1090,
available from the American Type Culture Collection
(ATCC #37197), could be used. The infected bacteria
were plated and the resultant colonies were checked
for loss of beta-galactosidase activity (clear
plaques) in the presence of X-gal using a standard X-
gal substrate plaque assay method (Maniatis). About
50% of the phage plaques showed loss of beta-
galactosidase enzyme activity (recombinants).

C. Screening for ET-NANB Recombinant Proteins 

ET-NANB convalescent antiserum was obtained 
from patients infected during documented ET-NANB 
outbreaks in Mexico, Borneo, Pakistan, Somalia, and 
Burma. The sera were immunoreactive with VLPs in stool 
specimens from each of several other patients with ET-
NANB hepatitis.

A lawn of E. coli KM392 cells infected with
about 10^4 pfu of the phage stock from above was
prepared on a 150 mm plate and incubated, inverted,
for 5-8 hours at 37°C. The lawn was overlaid with a
nitrocellulose sheet, causing transfer of expressed
ET-NANB recombinant protein from the plaques to the
paper. The plate and filter were indexed for matching
corresponding plate and filter positions.

The filter was washed twice in TBST buffer
(10 mM Tris, pH 8.0, 150 mM NaCl, 0.05% Tween 20),
blocked with AIB (TBST buffer with 1% gelatin), washed
again in TBST, and incubated overnight after addition
of antiserum (diluted to 1:50 in AIB, 12-15 ml/plate).
The sheet was washed twice in TBST and then contacted
with enzyme-labeled anti-human antibody to attach the
labeled antibody at filter sites containing antigen
recognized by the antiserum. After a final washing,
the filter was developed in a substrate medium
containing 33 µl NBT (50 mg/ml stock solution
maintained at 4°C) mixed with 16 µl BCIP (50 mg/ml
stock solution maintained at 4°C) in 5 ml of alkaline
phosphatase buffer (100 mM Tris, pH 9.5, 100 mM NaCl, 5
mM MgCl2). Purple color appeared at points of antigen
production, as recognized by the antiserum.
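
The working concentrations of the two chromogenic substrates follow from the stock concentrations and volumes given above. The following is a minimal sketch of the dilution arithmetic, assuming the added stock volumes simply add to the 5 ml of buffer; the function name `final_conc_mg_per_ml` is illustrative.

```python
def final_conc_mg_per_ml(stock_mg_per_ml: float, added_ul: float, total_ml: float) -> float:
    """Final concentration after diluting `added_ul` of stock into `total_ml`."""
    mass_mg = stock_mg_per_ml * added_ul / 1000.0  # µl -> ml
    return mass_mg / total_ml


BUFFER_ML = 5.0
NBT_UL, BCIP_UL = 33.0, 16.0
STOCK_MG_PER_ML = 50.0

# Assumption: volumes are additive, so the total is buffer plus both additions.
total_ml = BUFFER_ML + (NBT_UL + BCIP_UL) / 1000.0
nbt_final = final_conc_mg_per_ml(STOCK_MG_PER_ML, NBT_UL, total_ml)
bcip_final = final_conc_mg_per_ml(STOCK_MG_PER_ML, BCIP_UL, total_ml)

if __name__ == "__main__":
    print(f"NBT:  {nbt_final:.3f} mg/ml")
    print(f"BCIP: {bcip_final:.3f} mg/ml")
```

Under that assumption the developing solution contains roughly 0.33 mg/ml NBT and 0.16 mg/ml BCIP.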

D. Screening Plating 

The areas of antigen production determined 
in the previous step were replated at about 100-200
pfu on an 82 mm plate. The above steps, beginning with
a 5-8 hour incubation, through NBT-BCIP development, 
were repeated in order to plaque purify phage 
secreting an antigen capable of reacting with the ET- 
NANB antibody. The identified plaques were picked and 
eluted in phage buffer (Maniatis, p. 443). 

E. Epitope Identification

A series of subclones derived from the
original pTZKF1(ET1.1) plasmid from Example 2 were
isolated using the same techniques described above.
Each of these five subclones was immunoreactive with
a pool of anti-ET antisera noted in C. The subclones
contained short sequences from the "reverse" sequence
set forth previously. The beginning and ending points
of the sequences in the subclones (relative to the
full "reverse" sequence), are identified in the table
below.

TABLE 1

Subclone    Position in "Reverse" Sequence
            5'-end      3'-end
Y1          522         643
Y2          594         667
Y3          508         665
Y4          558         752
Y5          545         665



Since all of the gene sequences identified 
in the table must contain the coding sequence for the 
epitope, it is apparent that the coding sequence for 
the epitope falls in the region between nucleotide 594
(5'-end) and 643 (3'-end). Genetic sequences
equivalent to and complementary to this relatively 
short sequence are therefore particularly preferred 
aspects of the present invention, as are peptides 
produced using this coding region. 

A second series of clones identifying an 
altogether different epitope was isolated with only 
Mexican serum. 





TABLE 2

Subclone    Position in "Forward" Sequence
            5'-end      3'-end
ET 2-2      2           193
ET 8-3      2           135
ET 9-1      2           109
ET 13-1     2           101

The coding sequence for this epitope falls
between nucleotide 2 (5'-end) and 101 (3'-end).
Genetic sequences related to this short sequence are
therefore also preferred, as are peptides produced
using this coding region.
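
The epitope-mapping argument used for both tables is an interval intersection: if every immunoreactive subclone must contain the epitope coding sequence, that sequence lies between the largest 5'-end and the smallest 3'-end. The following is a minimal sketch using the coordinates from Tables 1 and 2; the function name `common_region` is illustrative.

```python
def common_region(subclones: dict):
    """Region shared by all subclones, as (start, end), or None if they do not overlap.

    `subclones` maps a subclone name to its (5'-end, 3'-end) positions.
    """
    start = max(five_prime for five_prime, _ in subclones.values())
    end = min(three_prime for _, three_prime in subclones.values())
    return (start, end) if start <= end else None


# Positions in the "reverse" sequence (Table 1).
TABLE_1 = {
    "Y1": (522, 643),
    "Y2": (594, 667),
    "Y3": (508, 665),
    "Y4": (558, 752),
    "Y5": (545, 665),
}

# Positions in the "forward" sequence (Table 2).
TABLE_2 = {
    "ET 2-2": (2, 193),
    "ET 8-3": (2, 135),
    "ET 9-1": (2, 109),
    "ET 13-1": (2, 101),
}

if __name__ == "__main__":
    print("Table 1 shared region:", common_region(TABLE_1))
    print("Table 2 shared region:", common_region(TABLE_2))
```

The shared regions come out as nucleotides 594-643 for the first epitope and 2-101 for the second, matching the boundaries stated in the text.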

Two particularly preferred subclones for use 
in preparing polypeptides containing epitopes specific 
for HEV are the 406.3-2 and 406.4-2 clones whose 
sequences are set forth above. These sequences were 
isolated from an amplified cDNA library derived from a 
Mexican stool. Using the techniques described in this 
section, polypeptides expressed by these clones have 
been tested for immunoreactivity against a number of
different human HEV-positive sera obtained from
sources around the world. As shown in Table 3 below, 8
sera were immunoreactive with the polypeptide expressed by
the 406.4-2 clone, and 6 sera immunoreacted with the
polypeptide expressed by the 406.3-2 clone.

For comparison, the Table also shows
reactivity of the various human sera with the Y2 clone
identified in Table 1 above. Only one of the sera
reacted with the polypeptide expressed by this clone.
No immunoreactivity was seen for normal expression
products of the gt11 vector.



Table 3

Immunoreactivity of HEV Recombinant
Proteins: Human Sera

Sera  Source  Stage1  406.3-2  406.4-2  Y2  λgt11

FVH-21  Burma  A
FVH-8  Burma  A  +
SOM-19  Somalia  A
SOM-20  Somalia  A  +
IM-35  Borneo  A
IM-36  Borneo  A
PAX-1  Pakistan  A  +
FFI-4  Mexico  A
FFI-125  Mexico  A  +
F 387 IC  Mexico  C  +  ND
Normal  U.S.A.

1 A = acute; C = convalescent

While the invention has been described with
reference to particular embodiments, methods,
construction and use, it will be apparent to those
skilled in the art that various changes and
modifications can be made without departing from the
invention.